Journal of Computer Applications (《计算机应用》) ›› 2023, Vol. 43 ›› Issue (2): 437-449. DOI: 10.11772/j.issn.1001-9081.2021122072

• Cyberspace Security •


Review on privacy-preserving technologies in federated learning

Teng WANG1, Zheng HUO2(), Yaxin HUANG2, Yilin FAN2   

  1. China Electronics Technology Group Corporation Network Communication Research Institute, Shijiazhuang, Hebei 050081, China
    2. School of Information Technology, Hebei University of Economics and Business, Shijiazhuang, Hebei 050061, China
  • Received: 2021-12-09  Revised: 2022-01-21  Accepted: 2022-01-28  Online: 2023-02-08  Published: 2023-02-10
  • Contact: Zheng HUO
  • About the authors: WANG Teng, born in 1980 in Zunyi, Guizhou, Ph.D., senior engineer. His research interests include machine learning and digital governance.
    HUANG Yaxin, born in 1999 in Xingtai, Hebei, M.S. candidate. His research interests include privacy preservation.
    FAN Yilin, born in 1998 in Shijiazhuang, Hebei, M.S. candidate. Her research interests include federated learning.
  • Supported by:
    National Natural Science Foundation of China (62002098); Natural Science Foundation of Hebei Province (F2020207001)


Abstract:

In recent years, federated learning has become a new way to address the problems of data silos and privacy leakage in machine learning. The federated learning architecture does not require multiple parties to share their data resources: each participant only trains a local model on its own data and periodically uploads the model parameters to a server that updates the global model, so that a machine learning model built on large-scale global data can be obtained. Because of this privacy-preserving nature, federated learning is a promising scheme for future large-scale machine learning. However, the parameter exchange in this architecture may still lead to the disclosure of private data. At present, strengthening the privacy-preserving mechanisms of the federated learning architecture has become a new research hotspot. Starting from the privacy leakage problems in federated learning, the attack models and the paths through which sensitive information can be leaked were discussed, and several categories of privacy-preserving techniques for federated learning were reviewed with emphasis: techniques based on differential privacy, techniques based on homomorphic encryption, and techniques based on Secure Multiparty Computation (SMC). Finally, several key issues of privacy protection in federated learning were discussed, and future research directions were prospected.
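
To make the workflow described in the abstract concrete, the following minimal Python sketch (not taken from the paper; the three synthetic clients, the linear-regression model, the number of rounds, and the noise scale are illustrative assumptions) shows FedAvg-style training in which participants share only model parameters, optionally perturbed with Gaussian noise in the spirit of the differential-privacy-based protections this survey reviews:

import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One participant's local training: a few steps of linear-regression gradient descent."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of the mean squared error
        w -= lr * grad
    return w

def add_gaussian_noise(weights, sigma=0.01):
    """Perturb parameters before upload (sketch of a differential-privacy-style mechanism)."""
    return weights + rng.normal(0.0, sigma, size=weights.shape)

# Three participants with private local datasets (synthetic, for illustration only).
d = 5
true_w = rng.normal(size=d)
clients = []
for _ in range(3):
    X = rng.normal(size=(100, d))
    y = X @ true_w + 0.1 * rng.normal(size=100)
    clients.append((X, y))

# Federated rounds: clients never share raw data, only (noised) parameters.
global_w = np.zeros(d)
for _ in range(20):
    uploads = [add_gaussian_noise(local_update(global_w, X, y)) for X, y in clients]
    global_w = np.mean(uploads, axis=0)          # server-side aggregation (FedAvg)

print("distance to true weights:", np.linalg.norm(global_w - true_w))

In a real deployment the noise scale would be calibrated to a formal privacy budget, and the uploads could additionally be protected with secure aggregation or homomorphic encryption so that the server never observes any individual participant's parameters.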

Key words: federated learning, privacy preservation, differential privacy, homomorphic encryption, Secure Multiparty Computation (SMC)
