Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (3): 856-863.DOI: 10.11772/j.issn.1001-9081.2024030296

• Cyber security •

Lazy client identification method in federated learning based on proof-of-work

Haili LIN, Jing LI

  1. School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027, China
  • Received: 2024-03-20  Revised: 2024-04-12  Accepted: 2024-04-16  Online: 2024-04-24  Published: 2025-03-10
  • Contact: Jing LI
  • About author: LIN Haili, born in 1999 in Wenzhou, Zhejiang, M. S. candidate. His research interests include federated learning.
  • Supported by:
    Strategic Priority Research Program of Chinese Academy of Sciences (Category A) (XDA19020102)

Abstract:

In today’s society, with the growing demand for privacy protection, federated learning is receiving widespread attention. However, in federated learning it is difficult for the server to supervise the behaviors of clients, so the existence of lazy clients poses a potential threat to the performance and fairness of federated learning. To identify lazy clients efficiently and accurately, a backdoor-based dual-task proof-of-work method, called FedBD (FedBackDoor), was proposed. In FedBD, additional backdoor tasks that are easier to detect were assigned by the server to the clients participating in federated learning, these backdoor tasks were trained by the clients alongside their original training tasks, and the clients’ behaviors were supervised by the server indirectly through the training status of the backdoor tasks. Experimental results show that FedBD has certain advantages over the classic federated averaging algorithm FedAvg and the advanced GTG-Shapley (Guided Truncation Gradient Shapley) algorithm on datasets such as MNIST and CIFAR10. On the CIFAR10 dataset, when the proportion of lazy clients is set to 15%, FedBD improves the accuracy by more than 10 percentage points compared with FedAvg and by about 2 percentage points compared with GTG-Shapley. Moreover, the average training time of FedBD is only 11.8% of that of GTG-Shapley, and the accuracy of FedBD in identifying lazy clients exceeds 99% when the proportion of lazy clients is 10%. These results indicate that FedBD effectively mitigates the problem that lazy clients are difficult to supervise.
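
To make the mechanism concrete, the Python sketch below illustrates one way the backdoor-based proof-of-work could be realized; it is not the authors' released code. The trigger pattern (a white corner patch), the fixed target label, and the 0.8 backdoor-accuracy threshold are illustrative assumptions: an honest client trains on its local data mixed with triggered samples, a lazy client returns the global weights unchanged, and the server flags any returned model whose accuracy on a triggered probe set stays low.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

TARGET_LABEL = 0          # label the trigger should map to (assumed)
BACKDOOR_THRESHOLD = 0.8  # backdoor accuracy below this is flagged as lazy (assumed)

def add_trigger(images):
    # Stamp a 3x3 white patch into the lower-right corner (assumed trigger pattern).
    triggered = images.clone()
    triggered[..., -3:, -3:] = 1.0
    return triggered

def client_update(global_model, data, labels, lazy=False, steps=50, lr=0.01):
    # Honest clients train the original task and the backdoor task jointly;
    # lazy clients skip training and return the global weights unchanged.
    model = copy.deepcopy(global_model)
    if lazy:
        return model.state_dict()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    x = torch.cat([data, add_trigger(data)])
    y = torch.cat([labels, torch.full_like(labels, TARGET_LABEL)])
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model.state_dict()

def backdoor_accuracy(state_dict, template, probe_x):
    # Server-side proof-of-work check: fraction of triggered probes mapped to TARGET_LABEL.
    model = copy.deepcopy(template)
    model.load_state_dict(state_dict)
    model.eval()
    with torch.no_grad():
        preds = model(add_trigger(probe_x)).argmax(dim=1)
    return (preds == TARGET_LABEL).float().mean().item()

if __name__ == "__main__":
    torch.manual_seed(0)
    global_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    probe_x = torch.rand(64, 1, 28, 28)                       # server-held probe images
    data, labels = torch.rand(256, 1, 28, 28), torch.randint(0, 10, (256,))
    for cid, lazy in enumerate([False, False, True]):
        update = client_update(global_model, data, labels, lazy=lazy)
        acc = backdoor_accuracy(update, global_model, probe_x)
        print(f"client {cid}: backdoor accuracy {acc:.2f} -> "
              f"{'lazy' if acc < BACKDOOR_THRESHOLD else 'honest'}")

A client that skips local training cannot acquire the backdoor mapping, so its low accuracy on the triggered probe set exposes it without the server ever inspecting the original task's data or updates directly.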

Key words: federated learning, backdoor, lazy client, proof-of-work, data heterogeneity
