Federated security tree algorithm for user privacy protection

doi:10.11772/j.issn.1001-9081.2020030332

Journal of Computer Applications, 2020, Vol. 40, Issue (10): 2980-2985. DOI: 10.11772/j.issn.1001-9081.2020030332

• Cyber security •

ZHANG Junru¹, ZHAO Xiaoyan¹,², YUAN Peiyan¹,²

1. College of Computer and Information Engineering, Henan Normal University, Xinxiang Henan 453007, China;
2. Engineering Laboratory of Big Data for Teaching Resources and Assessment of Education Quality in Henan Province (Henan Normal University), Xinxiang Henan 453007, China

Received: 2020-03-21 Revised: 2020-05-27 Online: 2020-10-10 Published: 2020-06-03
Supported by:
This work is partially supported by the National Natural Science Foundation of China (U1804164), the Science and Technology Project of Henan Province (202102210323), and the Key Scientific Research Program for Henan Universities and Colleges (19A510015).

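Corresponding author: ZHAO Xiaoyan (赵晓焱)
About the authors: ZHANG Junru (张君如), born in 1999 in Xinxiang, Henan, female. Her research interests include mobile edge computing and machine learning. ZHAO Xiaoyan (赵晓焱), born in 1981 in Xuchang, Henan, female, Ph.D., associate professor. Her research interests include mobile edge computing and machine learning. YUAN Peiyan (袁培燕), born in 1978 in Dengzhou, Henan, male, Ph.D., associate professor, CCF senior member. His research interests include mobile networks and wireless communication.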
Abstract

Abstract: To address the low accuracy and low running efficiency of federated learning algorithms in user behavior prediction, a lossless Federated Learning Security tree (FLSectree) algorithm is proposed. First, by deriving the loss function, its first-order and second-order partial derivatives are shown to be sensitive data; the feature index sequence is therefore scanned and split so that only the encrypted optimal split point is returned, which keeps the sensitive data from being disclosed. Then, the instance space is updated and splitting continues to find the next best split point until the termination condition is met. Finally, the training results are used so that each participant obtains its local algorithm parameters. Experimental results show that FLSectree improves both the accuracy and the training efficiency of user behavior prediction while protecting data privacy. Compared with the SecureBoost algorithm in the Federated AI Technology Enabler (FATE) federated learning framework, FLSectree increases user behavior prediction accuracy by 9.09% and reduces running time by 87.42%, and its training results are consistent with those of the centralized XGBoost algorithm.

Keywords: federated learning, machine learning, data privacy, XGBoost algorithm, user behavior prediction
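
The split search summarized in the abstract builds on the first-order and second-order derivatives of the loss. The following is a minimal plaintext sketch of the standard XGBoost-style gain computation for a single feature, not the authors' FLSectree implementation. The function names, the logistic loss, and the regularization parameters `lam` and `gamma` are assumptions made for illustration, and the encryption of the derivatives that a federated setting would require is omitted.

```python
import math


def logistic_grad_hess(labels, raw_scores):
    """First and second derivatives of the logistic loss w.r.t. the raw score."""
    grads, hess = [], []
    for y, s in zip(labels, raw_scores):
        p = 1.0 / (1.0 + math.exp(-s))  # predicted probability
        grads.append(p - y)             # first-order derivative g_i
        hess.append(p * (1.0 - p))      # second-order derivative h_i
    return grads, hess


def best_split(feature_values, grads, hess, lam=1.0, gamma=0.0):
    """Scan one feature in sorted order and return (gain, threshold).

    Returns (0.0, None) when no split has positive gain.
    lam and gamma are hypothetical regularization settings.
    """
    order = sorted(range(len(feature_values)), key=lambda i: feature_values[i])
    g_total, h_total = sum(grads), sum(hess)
    parent_score = g_total * g_total / (h_total + lam)
    g_left = h_left = 0.0
    best_gain, best_threshold = 0.0, None
    for pos in range(len(order) - 1):
        i, j = order[pos], order[pos + 1]
        g_left += grads[i]
        h_left += hess[i]
        if feature_values[i] == feature_values[j]:
            continue  # cannot split between identical feature values
        g_right, h_right = g_total - g_left, h_total - h_left
        gain = 0.5 * (
            g_left * g_left / (h_left + lam)
            + g_right * g_right / (h_right + lam)
            - parent_score
        ) - gamma
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (feature_values[i] + feature_values[j])
    return best_gain, best_threshold


# Toy data (made up for illustration): low feature values are negatives.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [0, 0, 0, 1, 1, 1]
g, h = logistic_grad_hess(y, [0.0] * len(y))
gain, threshold = best_split(x, g, h)
```

Because the gain depends on the samples only through sums of the derivatives, a party that holds features but no labels needs only aggregated derivative statistics to propose split candidates, which is why these derivatives are treated as sensitive.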

ZHANG Junru, ZHAO Xiaoyan, YUAN Peiyan. Federated security tree algorithm for user privacy protection[J]. Journal of Computer Applications, 2020, 40(10): 2980-2985.
