Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (10): 2980-2985. DOI: 10.11772/j.issn.1001-9081.2020030332

• Cyber security •

Federated security tree algorithm for user privacy protection

ZHANG Junru1, ZHAO Xiaoyan1,2, YUAN Peiyan1,2   

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang Henan 453007, China;
  2. Engineering Laboratory of Big Data for Teaching Resources and Assessment of Education Quality in Henan Province (Henan Normal University), Xinxiang Henan 453007, China
  • Received: 2020-03-21 Revised: 2020-05-27 Online: 2020-10-10 Published: 2020-06-03
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (U1804164), the Science and Technology Project of Henan Province (202102210323), and the Key Scientific Research Program for Henan Universities and Colleges (19A510015).

面向用户隐私保护的联邦安全树算法

张君如1, 赵晓焱1,2, 袁培燕1,2   

  1. 河南师范大学 计算机与信息工程学院, 河南 新乡 453007;
  2. 教学资源与教育质量评估大数据河南省工程实验室(河南师范大学), 河南 新乡 453007
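  • Corresponding author: ZHAO Xiaoyan
  • About the authors: ZHANG Junru (b. 1999), female, from Xinxiang, Henan; main research interests: mobile edge computing, machine learning. ZHAO Xiaoyan (b. 1981), female, from Xuchang, Henan; associate professor, Ph.D.; main research interests: mobile edge computing, machine learning. YUAN Peiyan (b. 1978), male, from Dengzhou, Henan; associate professor, Ph.D., CCF senior member; main research interests: mobile networks, wireless communication.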
  • 基金资助:
    国家自然科学基金资助项目(U1804164);河南省科技攻关项目(202102210323);河南省高等学校重点科研项目(19A510015)。

Abstract: To address the low accuracy and low running efficiency of federated learning algorithms in user behavior prediction, a lossless Federated Learning Security tree (FLSectree) algorithm was proposed. Firstly, by deriving the loss function, its first and second partial derivatives were shown to be sensitive data; the feature index sequence was then scanned and split, and the optimal split point was returned in encrypted form so that the sensitive data was not disclosed. Then, the instance space was updated and splitting continued to find the next optimal split point until the termination condition was met. Finally, the training results were used to give each participant its local algorithm parameters. Experimental results show that FLSectree improves both the accuracy and the training efficiency of user behavior prediction while protecting data privacy. Compared with the SecureBoost algorithm in the Federated AI Technology Enabler (FATE) federated learning framework, FLSectree increases user behavior prediction accuracy by 9.09% and reduces running time by 87.42%, and its training results are consistent with those of the centralized XGBoost algorithm.
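
The split search summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the loss function or the encryption scheme, so this example assumes logistic loss and the standard XGBoost split gain, works on plaintext values, and leaves out encryption and the multi-party exchange entirely. The function names `logistic_grad_hess` and `best_split` and the parameter `reg_lambda` are hypothetical and do not come from the paper.

```python
import math


def logistic_grad_hess(y_true, raw_score):
    """First and second derivatives of logistic loss w.r.t. the raw score.

    Assumption: logistic loss; the abstract does not state which loss is used.
    """
    p = 1.0 / (1.0 + math.exp(-raw_score))
    return p - y_true, p * (1.0 - p)


def best_split(feature_values, grads, hessians, reg_lambda=1.0):
    """Scan one feature in sorted order and return (gain, threshold).

    Uses the standard XGBoost gain without the complexity penalty.
    Returns (0.0, None) when no split improves on the unsplit node.
    """
    order = sorted(range(len(feature_values)), key=lambda i: feature_values[i])
    g_total, h_total = sum(grads), sum(hessians)

    def score(g, h):
        # Structure score of a leaf holding gradient sum g and hessian sum h.
        return g * g / (h + reg_lambda)

    best_gain, best_threshold = 0.0, None
    g_left = h_left = 0.0
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        g_left += grads[i]
        h_left += hessians[i]
        if feature_values[i] == feature_values[nxt]:
            continue  # cannot split between equal feature values
        gain = 0.5 * (
            score(g_left, h_left)
            + score(g_total - g_left, h_total - h_left)
            - score(g_total, h_total)
        )
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (feature_values[i] + feature_values[nxt])
    return best_gain, best_threshold


# Toy data: four instances, one feature, initial raw score 0 for everyone.
labels = [0, 0, 1, 1]
feature = [1.0, 2.0, 3.0, 4.0]
stats = [logistic_grad_hess(y, 0.0) for y in labels]
grads = [g for g, _ in stats]
hessians = [h for _, h in stats]
print(best_split(feature, grads, hessians))
```

The per-instance `grads` and `hessians` are the quantities the abstract identifies as sensitive. In a federated setting they would be protected before leaving the party that holds the labels, whereas this sketch only shows the arithmetic of the scan.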

Key words: federated learning, machine learning, data privacy, XGBoost algorithm, user behavior prediction

摘要: 针对联邦学习算法在用户行为预测中存在的准确率低和运行效率不高等问题,提出一种无损失的联邦学习安全树(FLSectree)算法。首先,通过对损失函数的推导,证明损失函数的一阶偏导数与二阶偏导数为敏感数据,采用特征索引序列的扫描和分裂来返回加密后的最佳分裂点,以保护敏感数据不被泄露;接着,通过对实例空间的更新来继续向下分裂并寻找下一个最佳分裂点,直至满足终止条件后结束训练;最后,利用训练后的结果使得各参与方得到本地算法参数。实验结果表明,FLSectree算法能够在保护数据隐私的前提下有效提高用户行为预测算法的准确率和训练效率,与联邦学习FATE(Federated AI Technology Enabler)框架中的SecureBoost算法相比,FLSectree算法在用户行为预测中的准确率提高了9.09%,运行时间降低了87.42%,训练结果与集中式Xgboost算法一致。

关键词: 联邦学习, 机器学习, 数据隐私, Xgboost算法, 用户行为预测

CLC Number: