Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (4): 1131-1136. DOI: 10.11772/j.issn.1001-9081.2021071264

• The 36th CCF National Conference of Computer Applications (CCF NCCA 2020)

Improved federated weighted average algorithm

Changyin LUO1,2,3, Junyu WANG1,2,3, Xuebin CHEN1,2,3, Chundi MA1, Shufen ZHANG1,2,3

  1. College of Science, North China University of Science and Technology, Tangshan, Hebei 063210, China
  2. Hebei Province Key Laboratory of Data Science and Application (North China University of Science and Technology), Tangshan, Hebei 063210, China
  3. Tangshan Data Science Laboratory (North China University of Science and Technology), Tangshan, Hebei 063210, China
  • Received:2021-07-16 Revised:2021-10-13 Accepted:2021-10-18 Online:2021-10-13 Published:2022-04-10
  • Contact: Xuebin CHEN
  • About author:LUO Changyin, born in 1994, M. S. candidate. His research interests include data security.
    WANG Junyu, born in 1996, M. S. candidate. Her research interests include data security.
    MA Chundi, born in 1999. His research interests include network security.
    ZHANG Shufen, born in 1972, Ph. D., professor. Her research interests include data security.
    First author contact:CHEN Xuebin, born in 1970, Ph. D., professor. His research interests include data security, IoT security, network security.
  • Supported by:
    National Natural Science Foundation of China(U20A20179);Tangshan Science and Technology Project(18120203A)

改进的联邦加权平均算法

罗长银1,2,3, 王君宇1,2,3, 陈学斌1,2,3, 马春地1, 张淑芬1,2,3

  1. 华北理工大学 理学院,河北 唐山 063210
  2. 河北省数据科学与应用重点实验室(华北理工大学),河北 唐山 063210
  3. 唐山市数据科学重点实验室(华北理工大学),河北 唐山 063210
  • 通讯作者: 陈学斌
  • 作者简介:罗长银(1994—),男,陕西安康人,硕士研究生,CCF会员,主要研究方向:数据安全
    王君宇(1996—),女,河北唐山人,硕士研究生,主要研究方向:数据安全
    马春地(1999—),男,河北唐山人,主要研究方向:网络安全
    张淑芬(1972—),女,河北唐山人,教授,博士,CCF高级会员,主要研究方向:数据安全。
  • 基金资助:
    国家自然科学基金资助项目(U20A20179);唐山市科技厅项目(18120203A)

Abstract:

To address the influence of subjective factors on the data-quality calculation in the improved federated average algorithm based on the analytic hierarchy process, an improved federated weighted average algorithm was proposed to process multi-source data from the perspective of data quality. Firstly, the training samples were divided into pre-training samples and pre-testing samples. Then, the accuracy of the initial global model on the pre-training data was used as the quality weight of each data source. Finally, the quality weight was introduced into the federated average algorithm to update the weights of the global model. The simulation results show that, compared with the model trained by the traditional federated average algorithm, the model trained by the improved federated weighted average algorithm achieves higher accuracy, with improvements of up to 1.59% and 1.24% on the equally divided and unequally divided datasets, respectively. Compared with the traditional method of pooling multi-party data and retraining, the proposed algorithm yields slightly lower accuracy but improves the security of the data and the model.
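
The three steps summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the quality weight enters the aggregation, so the rule used here (each source's sample count multiplied by its quality weight, then normalized) is an assumption, and the function names `accuracy`, `fedavg`, and `quality_weighted_fedavg` are hypothetical. Model parameters are represented as flat lists of floats for simplicity.

```python
from typing import Callable, List, Sequence


def accuracy(predict: Callable[[Sequence[float]], int],
             samples: Sequence[Sequence[float]],
             labels: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label.

    Stands in for evaluating the initial global model on one data source.
    """
    correct = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
    return correct / len(labels)


def weighted_average(client_params: Sequence[Sequence[float]],
                     coeffs: Sequence[float]) -> List[float]:
    """Coordinate-wise average of client parameters with normalized coefficients."""
    total = sum(coeffs)
    if total <= 0:
        raise ValueError("aggregation coefficients must sum to a positive value")
    dim = len(client_params[0])
    return [
        sum(c * params[i] for params, c in zip(client_params, coeffs)) / total
        for i in range(dim)
    ]


def fedavg(client_params: Sequence[Sequence[float]],
           sample_counts: Sequence[int]) -> List[float]:
    """Standard FedAvg aggregation: weight each client by its sample count."""
    return weighted_average(client_params, list(sample_counts))


def quality_weighted_fedavg(client_params: Sequence[Sequence[float]],
                            sample_counts: Sequence[int],
                            quality_weights: Sequence[float]) -> List[float]:
    """Assumed variant: scale each sample count by the source's quality weight."""
    coeffs = [n * q for n, q in zip(sample_counts, quality_weights)]
    return weighted_average(client_params, coeffs)


if __name__ == "__main__":
    # Two toy clients with equal sample counts but different quality weights.
    params = [[1.0, 2.0], [3.0, 4.0]]
    print("FedAvg:          ", fedavg(params, [100, 100]))
    print("Quality-weighted:", quality_weighted_fedavg(params, [100, 100], [0.9, 0.6]))
```

Under this assumed rule, a source with a lower quality weight pulls the global parameters less than its sample count alone would, and equal quality weights reduce the aggregation to standard FedAvg.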

Key words: federated learning, Federated Average (FedAvg), federated weighted average algorithm, multi-source data, data quality

摘要:

针对基于层次分析改进的联邦平均算法在计算其数据质量时存在主观因素的影响,提出改进的联邦加权平均算法,从数据质量的角度来处理多源数据。首先,将训练样本划分为预训练样本与预测试样本;然后,使用初始全局模型在预训练数据上的精度作为该数据源的质量权重;最后,将质量权重引入到联邦平均算法中,重新进行全局模型中权重更新。仿真结果表明,在均等分割的数据集与非均等分割的数据集上,改进的联邦加权平均算法训练的模型与传统联邦平均算法训练的模型相比,准确率最高分别提升了1.59%和1.24%;改进的联邦加权平均算法训练的模型与传统整合多方数据再训练的模型相比,虽然准确率略有下降,但数据与模型的安全性有所提升。

关键词: 联邦学习, 联邦平均, 联邦加权平均算法, 多源数据, 数据质量

CLC Number: