Journal of Computer Applications (《计算机应用》): sole official website


Federated learning poisoning attack defense algorithm based on cluster evaluation

夏彬杰1,缪祥华1,2,刘义良3,吕艳1   

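  1. Kunming University of Science and Technology
    2. Faculty of Information Engineering and Automation, Kunming University of Science and Technology
    3. Kunming University of Science and Technology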
  • Received: 2025-11-18 Revised: 2026-03-29 Accepted: 2026-04-08 Online: 2026-04-23 Published: 2026-04-23
  • Corresponding author: 缪祥华
  • Funding:
    Yunnan Province Major Special Project Program

Federated learning poisoning attack defense algorithm research based on cluster evaluation

  • Received:2025-11-18 Revised:2026-03-29 Accepted:2026-04-08 Online:2026-04-23 Published:2026-04-23

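Abstract: In federated learning, global model performance degrades severely when the number of malicious clients varies or is large, especially when the data are non-independent and identically distributed (Non-IID). To address this problem, this paper proposes FedCE, a federated learning poisoning attack defense algorithm based on cluster evaluation. First, cosine distance and Manhattan distance are used to remove the cluster with the smallest sum of mixed distances, retaining as much client information as possible. Next, the aggregation parameters of the previous round's global model and the clients aggregated in that round are combined to evaluate historical aggregation information and select the optimal clustering result. Finally, the interquartile range (IQR) method is used to compute each client's proportion of abnormal gradients and to adjust the aggregation weights dynamically, improving the robustness of the global model. Experimental results show that, with Non-IID data and 40% malicious clients, FedCE improves model accuracy over the LASA algorithm by 1.64, 0.67, 0.03, and 0.54 percentage points on the MNIST dataset under the MinMax, Mimic, LabelFlipping, and Neurotoxin poisoning attacks, respectively, and by 1.28, 9.45, 1.46, and 4.67 percentage points on the CIFAR10 dataset. Across IID and Non-IID partitions of both datasets, and under different proportions of malicious clients and different types of poisoning attack, FedCE achieves clearly higher model accuracy than five mainstream defense algorithms.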
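The first step can be illustrated with a minimal sketch. The abstract does not say how the two distances are combined, which pairs of clients are compared, or how the clusters are formed, so everything below is an assumption for illustration only: `mixed_distance` is taken to be the unweighted sum of cosine and Manhattan distance, a cluster's score is the sum of pairwise mixed distances among its own members, and the cluster labels are supplied by the caller. All function names are hypothetical and are not taken from FedCE.

```python
import math
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 by convention if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def manhattan_distance(a, b):
    """Return the L1 distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mixed_distance(a, b):
    # Assumption: the "mixed distance" is the plain sum of the two distances.
    return cosine_distance(a, b) + manhattan_distance(a, b)


def cluster_scores(updates, labels):
    """Sum the pairwise mixed distances inside each cluster (assumed scoring rule)."""
    scores = {label: 0.0 for label in set(labels)}
    for i, j in combinations(range(len(updates)), 2):
        if labels[i] == labels[j]:
            scores[labels[i]] += mixed_distance(updates[i], updates[j])
    return scores


def drop_smallest_cluster(updates, labels):
    """Return the label of the lowest-scoring cluster and the indices of the remaining clients."""
    scores = cluster_scores(updates, labels)
    dropped = min(scores, key=scores.get)
    kept = [i for i, label in enumerate(labels) if label != dropped]
    return dropped, kept
```

Under this reading, a group of clients that submit nearly identical updates forms the tightest cluster and is the one removed. A single-member cluster scores zero and would also be removed, so a real implementation would need to handle that case explicitly.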

Keywords: federated learning, poisoning attack, anomaly detection, interquartile range method, cluster evaluation

Abstract: To address the severe loss of global model performance in federated learning when the number of malicious clients varies or is large, especially when the data are non-independent and identically distributed (Non-IID), a federated learning poisoning attack defense algorithm based on cluster evaluation, FedCE, is proposed. Firstly, the cosine distance and the Manhattan distance are used to eliminate the cluster with the smallest sum of mixed distances, so that client information is retained to the greatest extent. Then, the aggregation parameters of the previous round's global model and the clients aggregated in that round are combined to evaluate the historical aggregation information and select the optimal clustering result. Finally, the proportion of abnormal gradients of each client is calculated by the Interquartile Range (IQR) method, and the aggregation weights are adjusted dynamically to improve the robustness of the global model. The experimental results show that, when the data are Non-IID and the proportion of malicious clients is 40%, the model accuracy of FedCE under the MinMax, Mimic, LabelFlipping and Neurotoxin poisoning attacks is higher than that of the LASA algorithm by 1.64, 0.67, 0.03 and 0.54 percentage points respectively on the MNIST dataset, and by 1.28, 9.45, 1.46 and 4.67 percentage points respectively on the CIFAR10 dataset. Under both IID and Non-IID partitions of the two datasets, and across different proportions of malicious clients and different types of poisoning attack, the model accuracy of FedCE is clearly higher than that of five mainstream defense algorithms.
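The IQR-based weighting step can be sketched as follows. The abstract states only that each client's proportion of abnormal gradients drives its aggregation weight, so the details here are assumptions: outliers are flagged per coordinate with Tukey fences (the multiplier `k = 1.5` is the conventional default, not a value from the paper), and a client's weight is taken to be proportional to one minus its outlier ratio. The function names are hypothetical.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a list of at least two numbers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def outlier_ratios(updates, k=1.5):
    """For each client, return the fraction of its coordinates that fall outside
    the fences computed across all clients at the same coordinate."""
    n_clients, dim = len(updates), len(updates[0])
    counts = [0] * n_clients
    for j in range(dim):
        column = [update[j] for update in updates]
        low, high = iqr_fences(column, k)
        for i, value in enumerate(column):
            if value < low or value > high:
                counts[i] += 1
    return [count / dim for count in counts]


def aggregation_weights(updates, k=1.5):
    """Assumed rule: weight_i is proportional to 1 - outlier_ratio_i."""
    scores = [1.0 - ratio for ratio in outlier_ratios(updates, k)]
    total = sum(scores)
    if total == 0.0:
        # Defensive fallback to uniform weights if every score is zero.
        return [1.0 / len(updates)] * len(updates)
    return [score / total for score in scores]


def weighted_aggregate(updates, weights):
    """Return the coordinate-wise weighted sum of the client updates."""
    dim = len(updates[0])
    return [sum(w * u[j] for w, u in zip(weights, updates)) for j in range(dim)]


# Four similar clients and one client with two extreme coordinates.
updates = [
    [0.10, 0.20, 0.30],
    [0.12, 0.22, 0.32],
    [0.08, 0.18, 0.28],
    [0.11, 0.21, 0.31],
    [5.00, -4.00, 0.29],
]
weights = aggregation_weights(updates)
print(weights)
print(weighted_aggregate(updates, weights))
```

In this toy input the last client is flagged on two of its three coordinates, so it receives a smaller weight than the others instead of being discarded outright. Note that when more than half of the values at a coordinate are identical the IQR collapses to zero and every differing value is flagged, which a practical implementation would need to guard against.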

Key words: federated learning, poisoning attack, anomaly detection, interquartile range, cluster evaluation

CLC number: