Journal of Computer Applications (《计算机应用》) official website ›› 2024, Vol. 44 ›› Issue (10): 3011-3020. DOI: 10.11772/j.issn.1001-9081.2023101475
About author:
YIN Chunyong, born in 1977 in Weifang, Shandong, Ph. D., professor, Ph. D. supervisor. His research interests include cyberspace security, big data mining and privacy protection, artificial intelligence and novel computing. E-mail: yinchunyong@hotmail.com
Chunyong YIN1, Yongcheng ZHOU2
Received:
2023-11-02
Revised:
2024-02-23
Accepted:
2024-02-26
Online:
2024-10-15
Published:
2024-10-10
Contact:
Chunyong YIN
About author:
ZHOU Yongcheng, born in 2000, M. S. candidate. His research interests include federated learning.
Supported by:
Abstract:
Federated learning (FL) is a distributed machine learning approach in which clients jointly train a global model; however, a single global model copes poorly with data drawn from multiple distributions. Clustered federated learning addresses this multi-distribution challenge by grouping clients so that several shared models can be optimized. Within this paradigm, server-side clustering struggles to correct misclassified clients, while client-side clustering is highly sensitive to the choice of initial models. To solve these problems, an Automatically Adjusted Clustered Federated Learning (AACFL) framework was proposed, which integrates server-side and client-side clustering through double-ended clustering. First, double-ended clustering partitions the clients into adjustable clusters; then local client identities are adjusted automatically; finally, the correct client clusters are obtained. Experimental results on three classic federated datasets under non-IID settings show that AACFL can recover the correct clusters through adjustment even when the double-ended clustering results contain errors. With 4 clusters and 100 clients, compared with methods such as the Federated Averaging (FedAvg) algorithm, Clustered Federated Learning (CFL) and the Iterative Federated Clustering Algorithm (IFCA), AACFL converges faster, reaches the correct clustering result sooner, and improves accuracy by 0.20 to 23.16 percentage points on average. These results verify that the proposed framework clusters clients efficiently and improves both model convergence speed and accuracy.
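The client-side step described in the abstract, where each client chooses the cluster whose model best fits its local data, can be illustrated with an IFCA-style selection rule. This is a minimal hypothetical sketch, not the paper's AACFL implementation; the functions `mse` and `select_cluster` and the toy one-parameter "models" are illustrative assumptions only.

```python
# Hypothetical sketch of client-side cluster-identity selection,
# in the spirit of IFCA: each client evaluates every cluster model
# on its own data and joins the best-fitting cluster.

def mse(w, data):
    """Mean squared error of a toy 1-D linear model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def select_cluster(cluster_models, client_data):
    """Return the index of the cluster model with the lowest local loss."""
    losses = [mse(w, client_data) for w in cluster_models]
    return losses.index(min(losses))

cluster_models = [1.0, -1.0]          # two cluster models (slopes)
client_a = [(1, 1.1), (2, 1.9)]       # data roughly following y = x
client_b = [(1, -0.9), (2, -2.1)]     # data roughly following y = -x
```

Under this rule, `client_a` would join cluster 0 and `client_b` cluster 1; AACFL's double-ended design additionally lets the server-side grouping be corrected by such local re-selection.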
Chunyong YIN, Yongcheng ZHOU. Automatically adjusted clustered federated learning for double-ended clustering[J]. Journal of Computer Applications, 2024, 44(10): 3011-3020.
Tab. 1 Accuracy evaluation of different algorithms under k=2, n=50 (unit: %)

| Algorithm | EMNIST | CIFAR-10 | Fashion-MNIST |
| --- | --- | --- | --- |
| FedAvg | 70.18 | 48.93 | 81.46 |
| FeSEM | 78.24 | 55.41 | 84.80 |
| WeCFL | 78.46 | 55.76 | 85.10 |
| IFCA | 79.15 | 57.78 | 85.79 |
| CFL | 79.14 | 55.57 | 84.82 |
| AACFL | 79.20 | 56.85 | 86.13 |
Tab. 2 Accuracy evaluation of different algorithms under k=4, n=100 (unit: %)

| Algorithm | EMNIST | CIFAR-10 | Fashion-MNIST |
| --- | --- | --- | --- |
| FedAvg | 56.19 | 42.62 | 72.82 |
| FeSEM | 74.20 | 53.12 | 82.13 |
| WeCFL | 74.87 | 53.68 | 82.68 |
| IFCA | 78.97 | 53.20 | 85.64 |
| CFL | 78.87 | 56.67 | 85.53 |
| AACFL | 79.35 | 58.29 | 85.84 |
Tab. 3 Accuracy evaluation of AACFL under different learning rates (accuracy in %)

| Learning rate | EMNIST | CIFAR-10 | Fashion-MNIST |
| --- | --- | --- | --- |
| 0.10 | 78.28 | 57.44 | 86.76 |
| 0.07 | 78.12 | 57.34 | 86.06 |
| 0.04 | 77.26 | 56.90 | 84.46 |
| 0.01 | 72.89 | 50.49 | 80.46 |
Tab. 4 ARI evaluation and round consumption evaluation of AACFL with different clustered client participation rates q_κ

| q_κ | ARI (EMNIST) | Rounds (EMNIST) | ARI (CIFAR-10) | Rounds (CIFAR-10) | ARI (Fashion-MNIST) | Rounds (Fashion-MNIST) |
| --- | --- | --- | --- | --- | --- | --- |
| 0.4 | 0.82 | 32 | 0.79 | 12 | 0.94 | 10 |
| 0.6 | 0.92 | 29 | 0.89 | 16 | 0.94 | 4 |
| 0.8 | 0.97 | 9 | 0.87 | 20 | 0.94 | 7 |
| 1.0 | 0.97 | 11 | 0.97 | 5 | 0.97 | 3 |
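The ARI values above are Adjusted Rand Index scores between the recovered client clusters and the ground-truth grouping. A pure-Python sketch of the standard contingency-table ARI formula (this is the textbook definition, not code from the paper; the helper name `adjusted_rand_index` is illustrative):

```python
# Adjusted Rand Index between two label assignments of the same clients.
# ARI = 1 for identical partitions (up to relabeling); ~0 for random ones.
from math import comb
from collections import Counter

def adjusted_rand_index(labels_true, labels_pred):
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))  # contingency-table cells
    a = Counter(labels_true)                        # row sums
    b = Counter(labels_pred)                        # column sums
    sum_ij = sum(comb(v, 2) for v in pairs.values())
    sum_a = sum(comb(v, 2) for v in a.values())
    sum_b = sum(comb(v, 2) for v in b.values())
    expected = sum_a * sum_b / comb(n, 2)           # chance-level index
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)
```

Because ARI is invariant to cluster relabeling, a clustering that swaps the names of two clusters but groups clients identically still scores 1.0, which is why it suits evaluating recovered federated clusters.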
Tab. 5 Time consumption of different algorithms under k=2, n=50 (unit: s)

| Algorithm | EMNIST | CIFAR-10 | Fashion-MNIST |
| --- | --- | --- | --- |
| FedAvg | 2 146 | 5 415 | 2 812 |
| FeSEM | 2 567 | 5 897 | 3 125 |
| WeCFL | 2 583 | 5 812 | 3 217 |
| IFCA | 4 054 | 8 240 | 4 545 |
| CFL | 2 348 | 5 755 | 3 064 |
| AACFL | 2 787 | 5 990 | 3 364 |
Tab. 6 Time consumption of different algorithms under k=4, n=100 (unit: s)

| Algorithm | EMNIST | CIFAR-10 | Fashion-MNIST |
| --- | --- | --- | --- |
| FedAvg | 5 116 | 9 214 | 5 042 |
| FeSEM | 5 623 | 9 845 | 5 543 |
| WeCFL | 5 711 | 9 834 | 5 604 |
| IFCA | 7 132 | 14 564 | 7 465 |
| CFL | 5 521 | 9 664 | 5 402 |
| AACFL | 5 831 | 10 168 | 5 799 |
References:
1. McMAHAN H B, MOORE E, RAMAGE D, et al. Communication-efficient learning of deep networks from decentralized data[C]// Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. New York: JMLR.org, 2017: 1273-1282.
2. LI X, HUANG K X, YANG W H, et al. On the convergence of FedAvg on non-IID data[EB/OL]. (2020-06-25) [2023-09-12].
3. YANG H, FANG M, LIU J. Achieving linear speedup with partial worker participation in non-IID federated learning[EB/OL]. (2021-05-04) [2023-09-12].
4. DUAN M, LIU D, CHEN X, et al. Astraea: self-balancing federated learning for improving classification accuracy of mobile deep learning applications[C]// Proceedings of the 2019 IEEE 37th International Conference on Computer Design. Piscataway: IEEE, 2019: 246-254.
5. LUO C Y, CHEN X B, LIU Y, et al. A federated ensemble algorithm for multi-source data security[J]. Computer Engineering and Science, 2021, 43(8): 1387-1397.
6. HAO M, LI H, XU G, et al. Towards efficient and privacy-preserving federated deep learning[C]// Proceedings of the 2019 IEEE International Conference on Communications. Piscataway: IEEE, 2019: 1-6.
7. FANG C, GUO Y, WANG N, et al. Highly efficient federated learning with strong privacy preservation in cloud computing[J]. Computers and Security, 2020, 96: No.101889.
8. LI Q, WEN Z, WU Z, et al. A survey on federated learning systems: vision, hype and reality for data privacy and protection[J]. IEEE Transactions on Knowledge and Data Engineering, 2021, 35(4): 3347-3366.
9. HARD A, RAO K, MATHEWS R, et al. Federated learning for mobile keyboard prediction[EB/OL]. (2019-02-28) [2023-09-12].
10. BRISIMI T S, CHEN R, MELA T, et al. Federated learning of predictive models from federated electronic health records[J]. International Journal of Medical Informatics, 2018, 112: 59-67.
11. LI T, SAHU A K, TALWALKAR A, et al. Federated learning: challenges, methods, and future directions[J]. IEEE Signal Processing Magazine, 2020, 37(3): 50-60.
12. JIN H, BAI D, YAO D, et al. Personalized edge intelligence via federated self-knowledge distillation[J]. IEEE Transactions on Parallel and Distributed Systems, 2023, 34(2): 567-580.
13. LI H, CAI Z, WANG J, et al. FedTP: federated learning by Transformer personalization[EB/OL]. (2023-04-18) [2023-09-12].
14. SATTLER F, MÜLLER K R, SAMEK W. Clustered federated learning: model-agnostic distributed multitask optimization under privacy constraints[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(8): 3710-3722.
15. SATTLER F, MÜLLER K R, WIEGAND T, et al. On the Byzantine robustness of clustered federated learning[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 8861-8865.
16. DENNIS D K, LI T, SMITH V. Heterogeneity for the win: one-shot federated clustering[C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 2611-2620.
17. GHOSH A, CHUNG J, YIN D, et al. An efficient framework for clustered federated learning[J]. IEEE Transactions on Information Theory, 2022, 68(12): 8076-8091.
18. BRIGGS C, FAN Z, ANDRAS P. Federated learning with hierarchical clustering of local updates to improve training on non-IID data[C]// Proceedings of the 2020 International Joint Conference on Neural Networks. Piscataway: IEEE, 2020: 1-9.
19. LI C, LI G, VARSHNEY P K. Federated learning with soft clustering[J]. IEEE Internet of Things Journal, 2022, 9(10): 7773-7782.
20. DUAN M, LIU D, CHEN X, et al. Self-balancing federated learning with global imbalanced data in mobile systems[J]. IEEE Transactions on Parallel and Distributed Systems, 2021, 32(1): 59-71.
21. LI T, SAHU A K, ZAHEER M, et al. Federated optimization in heterogeneous networks[EB/OL]. [2023-09-12].
22. MOTHUKURI V, PARIZI R M, POURIYEH S, et al. A survey on security and privacy of federated learning[J]. Future Generation Computer Systems, 2021, 115: 619-640.
23. ZHAO Y, LI M, LAI L, et al. Federated learning with non-IID data[EB/OL]. (2022-07-21) [2023-09-12].
24. LU R, ZHANG W, WANG Y, et al. Auction-based cluster federated learning in mobile edge computing systems[J]. IEEE Transactions on Parallel and Distributed Systems, 2023, 34(4): 1145-1158.
25. DUAN M, LIU D, JI X, et al. Flexible clustered federated learning for client-level data distribution shift[J]. IEEE Transactions on Parallel and Distributed Systems, 2022, 33(11): 2661-2674.
26. ZHANG Y, LIU D, DUAN M, et al. FedMDS: an efficient model discrepancy-aware semi-asynchronous clustered federated learning framework[J]. IEEE Transactions on Parallel and Distributed Systems, 2023, 34(3): 1007-1019.
27. LLOYD S. Least squares quantization in PCM[J]. IEEE Transactions on Information Theory, 1982, 28(2): 129-137.
28. RUAN Y, JOE-WONG C. FedSoft: soft clustered federated learning with proximal local updating[C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 8124-8131.
29. TIAN P, LIAO W X, YU W, et al. WSCC: a weight-similarity-based client clustering approach for non-IID federated learning[J]. IEEE Internet of Things Journal, 2022, 9(20): 20243-20256.
30. LONG G, XIE M, SHEN T, et al. Multi-center federated learning: clients clustering for better personalization[J]. World Wide Web, 2023, 26: 481-500.
31. AGRAWAL S, SARKAR S, ALAZAB M, et al. Genetic CFL: hyperparameter optimization in clustered federated learning[J]. Computational Intelligence and Neuroscience, 2021, 2021: No.7156420.
32. LU C Y, DENG S, MA W B, et al. Clustered federated learning methods based on DBSCAN clustering[J]. Computer Science, 2022, 49(6A): 232-237.
33. CHANG L M, LIU Y H, XU S Z. Clustering federated learning based on data distribution[J]. Application Research of Computers, 2023, 40(6): 1697-1701.
34. STALLMANN M, WILBIK A. Towards federated clustering: a Federated Fuzzy c-Means algorithm (FFCM)[EB/OL]. (2022-01-18) [2023-09-12].
35. XIE H, MA J, XIONG L, et al. Federated graph classification over non-IID graphs[C]// Proceedings of the 35th Conference on Neural Information Processing Systems. New York: ACM, 2024: 18839-18852.
36. COHEN G, AFSHAR S, TAPSON J, et al. EMNIST: extending MNIST to handwritten letters[C]// Proceedings of the 2017 International Joint Conference on Neural Networks. Piscataway: IEEE, 2017: 2921-2926.
37. KRIZHEVSKY A. Learning multiple layers of features from tiny images[D]. Toronto: University of Toronto, 2009: 1-60.
38. XIAO H, RASUL K, VOLLGRAF R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms[EB/OL]. (2017-09-15) [2023-09-12].
39. MA J, LONG G, ZHOU T, et al. On the convergence of clustered federated learning[EB/OL]. (2022-06-07) [2023-09-12].