
Personalized federated learning based on similarity clustering and regularization

Wu Jie (巫婕)1, Qian Xuezhong (钱雪忠)2, Song Wei (宋威)1

  1. Jiangnan University
  2. Guiyuan 9#521, Lihu Campus, Jiangnan University, Wuxi, Jiangsu Province
  • Received: 2023-12-06 Revised: 2024-03-06 Published: 2024-03-22
  • Corresponding author: Wu Jie
  • Funding:
    National Natural Science Foundation of China

Personalized federated learning based on similarity clustering and regularization

  • Received:2023-12-06 Revised:2024-03-06 Online:2024-03-22
  • Supported by:
    the National Natural Science Foundation of China

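Abstract: In federated learning applications, client data heterogeneity and differing task requirements often make it necessary to provide personalized models. Some existing personalized federated learning methods face a trade-off between personalization and global generalization. They also adopt the traditional FL practice of weighting aggregation by client data volume, which degrades model performance for clients whose data distributions differ greatly and leaves them without a personalized aggregation strategy. To address these problems, a new personalized federated learning method (pFedSCR) is proposed. In the client's local update phase, pFedSCR trains a personalized model and a local model; the personalized model adds L2-norm regularization to the cross-entropy loss and dynamically adjusts how closely it refers to the global model, achieving personalization while drawing on global knowledge. In the server aggregation phase, clients are clustered by the similarity of their model updates, a similarity matrix is constructed, and the aggregation weights are dynamically adjusted to aggregate a personalized model for each client, which personalizes the parameter aggregation strategy while alleviating data heterogeneity. Experiments simulate a variety of Non-IID data scenarios with a Dirichlet distribution on three datasets, including CIFAR-10 and MNIST. The results show that pFedSCR outperforms FL algorithms such as the classic FedProx and the recent personalized algorithm FedPCL in both accuracy and communication efficiency across these scenarios, reaching an accuracy of up to 99.03%.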
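The client-side idea in the abstract, adding an L2-norm term to the task loss so that the personalized model stays near the global model, can be illustrated with a minimal sketch. It assumes the common proximal form L_task(θ) + (λ/2)·‖θ − w‖², which the abstract does not spell out. The function name `regularized_step`, the fixed coefficient `lam`, and the plain-list parameter representation are all hypothetical; the rule pFedSCR uses to adjust the coefficient dynamically is not given here and is not reproduced.

```python
def regularized_step(personal, global_model, task_grad, lam, lr):
    """One gradient step on L_task(theta) + (lam / 2) * ||theta - w||^2.

    personal     -- personalized model parameters theta (list of floats)
    global_model -- global model parameters w (list of floats)
    task_grad    -- gradient of the task loss (e.g. cross-entropy) at theta
    lam          -- regularization strength (assumed fixed in this sketch)
    lr           -- learning rate
    """
    # The gradient of the L2 term is lam * (theta - w); it is added to the
    # task gradient, so a larger lam pulls theta harder toward w.
    return [
        theta - lr * (g + lam * (theta - w))
        for theta, w, g in zip(personal, global_model, task_grad)
    ]


# With a zero task gradient, only the pull toward the global model remains.
theta = regularized_step(
    personal=[1.0, 2.0],
    global_model=[0.0, 0.0],
    task_grad=[0.0, 0.0],
    lam=0.5,
    lr=0.1,
)
```

Setting `lam` to 0 reduces the step to ordinary gradient descent on the task loss, while a large `lam` keeps the personalized model close to the global one; this is the personalization versus generalization trade-off the abstract refers to.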

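Key words: federated learning, non-independent and identically distributed, cosine similarity, regularization, personalized federated learning, privacy security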

Abstract: Federated learning applications often face client data heterogeneity and differing task requirements, both of which call for personalized models. However, some existing personalized federated learning methods struggle with the trade-off between personalization and global generalization, and most of them retain the traditional FL scheme of weighting aggregation by client data volume, so the aggregation weights lack personalization. To address these problems, a personalized federated learning method based on similarity clustering and regularization (pFedSCR) is proposed. In the client's local update phase, pFedSCR trains a private model with L2-norm regularization, which dynamically controls the extent to which the private model refers to the global model. In the server aggregation phase, it clusters clients by model similarity and dynamically adjusts the aggregation weights to aggregate a personalized model for each client. In the experiments, Non-IID data scenarios are simulated with a Dirichlet distribution on three datasets, including CIFAR-10. Compared with five algorithms, including the classic FedProx and the recent personalized algorithm FedPCL, pFedSCR achieves higher accuracy across the scenarios, with test accuracy of up to 99.03%.
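The server-side step, aggregating a separate model for each client with weights derived from the similarity of client updates, might look roughly like the sketch below. It is not the pFedSCR procedure itself: it skips the explicit clustering step and, as an assumption, turns each row of the cosine-similarity matrix into weights with a softmax. The function names and the `temperature` parameter are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two flattened update vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(updates):
    """Pairwise cosine similarities between all client updates."""
    return [[cosine_similarity(u, v) for v in updates] for u in updates]


def aggregation_weights(sim_row, temperature=1.0):
    """Softmax over one row of the similarity matrix (an assumed weighting rule)."""
    exps = [math.exp(s / temperature) for s in sim_row]
    total = sum(exps)
    return [e / total for e in exps]


def personalized_aggregate(models, updates, temperature=1.0):
    """Return one aggregated model per client, weighted by update similarity."""
    sims = similarity_matrix(updates)
    dim = len(models[0])
    aggregated = []
    for row in sims:
        weights = aggregation_weights(row, temperature)
        aggregated.append(
            [sum(w * m[d] for w, m in zip(weights, models)) for d in range(dim)]
        )
    return aggregated


# Clients 0 and 1 moved in the same direction; client 2 moved the opposite way.
updates = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
models = [[1.0, 1.0], [3.0, 1.0], [-5.0, 1.0]]
sims = similarity_matrix(updates)
per_client = personalized_aggregate(models, updates)
```

Unlike weighting by data volume, each client here receives its own weight vector, so clients with similar updates contribute more to one another's aggregated model than dissimilar ones do.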

Key words: federated learning (FL), non-independent and identically distributed (Non-IID), cosine similarity, regularization, personalized federated learning (PFL), privacy security

CLC number: