Journal of Computer Applications, 2025, Vol. 45, Issue (1): 127-135. DOI: 10.11772/j.issn.1001-9081.2024010068
Received: 2024-01-19
Revised: 2024-04-05
Accepted: 2024-04-07
Online: 2024-05-09
Published: 2025-01-10
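Corresponding author: Xingying QIAN
About the author: YAN Yan, born in 1980 in Lanzhou, Gansu, Ph. D., professor, CCF senior member. Her research interests include privacy protection and information security.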
        
Yan YAN, Xingying QIAN, Pengbin YAN, Jie YANG
Abstract:
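To address the information silos caused by distributed collection of location big data and the risk of location privacy leakage, a federated learning-based method for statistical prediction and privacy protection of location big data is proposed. First, a statistical prediction and release framework for location big data is built on horizontal federated learning. The framework lets the data collector of each administrative region keep its own raw data, while multiple participants jointly train the prediction model by exchanging training parameters. Second, for the density statistical prediction of location big data with spatiotemporal sequence characteristics, a PVTv2-CBAM model is designed to improve the accuracy of client-side predictions. Finally, a dynamic allocation and adjustment algorithm for the differential privacy budget is proposed and combined with the MMA (Modified Moments Accountant) mechanism to provide differential privacy protection for client models. Experimental results show that, compared with the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) network, and Convolutional LSTM (ConvLSTM) models, PVTv2-CBAM reduces the Mean Absolute Error (MAE) of prediction by 0 to 62% on the Yellow_tripdata dataset and by 39% to 44% on the T-Driver trajectory dataset. With adjustment thresholds of 0.3 and 0.7, the proposed dynamic privacy budget allocation and adjustment algorithm improves prediction accuracy by about 5% and 6%, respectively, compared with no dynamic adjustment. These results verify the feasibility and effectiveness of the proposed method.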
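The horizontal federated setup summarized in the abstract, where each client keeps its raw data and only training parameters are exchanged, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses plain sample-size-weighted averaging of client updates, and it substitutes simple norm clipping plus Gaussian noise for the MMA mechanism and the dynamic budget allocation, neither of which is specified on this page. The function names, `clip_norm`, and `noise_std` are hypothetical.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    factor = clip_norm / norm
    return [v * factor for v in update]


def privatize_update(update, clip_norm, noise_std, rng):
    """Clip a client update, then add zero-mean Gaussian noise per coordinate.

    Assumption: generic clip-and-noise step, standing in for the paper's MMA.
    """
    clipped = clip_update(update, clip_norm)
    return [v + rng.gauss(0.0, noise_std) for v in clipped]


def federated_average(global_weights, client_updates, client_sizes):
    """Add the sample-size-weighted mean of client updates to the global weights."""
    total = sum(client_sizes)
    new_weights = list(global_weights)
    for update, size in zip(client_updates, client_sizes):
        for i, v in enumerate(update):
            new_weights[i] += (size / total) * v
    return new_weights


if __name__ == "__main__":
    rng = random.Random(0)
    global_weights = [0.0, 0.0, 0.0]
    # Hypothetical updates from three regional data collectors.
    raw_updates = [[0.5, -0.2, 0.1], [3.0, 4.0, 0.0], [-0.1, 0.3, 0.2]]
    sizes = [100, 300, 200]
    noisy = [privatize_update(u, clip_norm=1.0, noise_std=0.05, rng=rng)
             for u in raw_updates]
    print(federated_average(global_weights, noisy, sizes))
```

Only the noised updates leave each client in this sketch; the server never sees raw location records.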
Yan YAN, Xingying QIAN, Pengbin YAN, Jie YANG. Federated learning-based statistical prediction and differential privacy protection method for location big data[J]. Journal of Computer Applications, 2025, 45(1): 127-135.
| Model | Floating-point operations/FLOPs | Parameters/10^6 |
|---|---|---|
| CNN | 15.50 | 0.174 | 
| LSTM | 13.46 | 0.166 | 
| ConvLSTM | 16.16 | 0.196 | 
| PVTv2 | 19.72 | 0.159 | 
| PVTv2-CBAM | 19.90 | 0.163 | 
Tab. 1 Computational costs and parameter sizes of different models
| Model | MAE | RMSE | H(P,Q) |
|---|---|---|---|
| CNN | 0.060±0.059 | 0.133±0.068 | 0.084±0.002 | 
| LSTM | 0.119±0.050 | 0.123±0.060 | 0.123±0.005 | 
| ConvLSTM | 0.159±0.088 | 0.185±0.045 | 0.119±0.010 | 
| PVTv2 | 0.097±0.083 | 0.127±0.073 | 0.051±0.013 | 
| PVTv2-CBAM | 0.060±0.007 | 0.093±0.019 | 0.053±0.005 | 
Tab. 2 Accuracy evaluation metrics of different models on Yellow_tripdata dataset
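The three accuracy columns can be computed along the following lines. This is a sketch under assumptions: the page does not define H(P,Q), so it is taken here to be the cross-entropy between the true and predicted density distributions, each normalized to sum to 1. The helper names and the sample grid values are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def normalize(values):
    """Turn non-negative counts into a discrete probability distribution."""
    total = sum(values)
    return [v / total for v in values]


def cross_entropy(p, q, eps=1e-12):
    """Assumed definition: H(P, Q) = -sum_i p_i * log(q_i), with q floored at eps."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Hypothetical per-cell location densities for one time step.
    true_density = [0.10, 0.40, 0.30, 0.20]
    pred_density = [0.12, 0.35, 0.33, 0.20]
    print("MAE   :", mae(true_density, pred_density))
    print("RMSE  :", rmse(true_density, pred_density))
    print("H(P,Q):", cross_entropy(normalize(true_density), normalize(pred_density)))
```

For all three metrics, lower values indicate predictions closer to the true density.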
| Model | MAE | RMSE | H(P,Q) |
|---|---|---|---|
| CNN | 0.071±0.010 | 0.086±0.035 | 0.065±0.010 | 
| LSTM | 0.075±0.040 | 0.060±0.039 | 0.023±0.013 | 
| ConvLSTM | 0.069±0.011 | 0.043±0.002 | 0.084±0.010 | 
| PVTv2 | 0.041±0.021 | 0.045±0.011 | 0.038±0.014 | 
| PVTv2-CBAM | 0.042±0.029 | 0.034±0.017 | 0.033±0.011 | 
Tab. 3 Accuracy evaluation metrics of different models on T-Driver dataset
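The MAE reduction ranges quoted in the abstract can be checked against the mean MAE values in Tab. 2 and Tab. 3. The short script below is only an arithmetic check. It ignores the reported standard deviations, and the variable names are illustrative.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return (baseline - proposed) / baseline * 100.0


# Mean MAE values copied from Tab. 2 and Tab. 3 (standard deviations omitted).
baseline_mae = {
    "Yellow_tripdata": {"CNN": 0.060, "LSTM": 0.119, "ConvLSTM": 0.159},
    "T-Driver": {"CNN": 0.071, "LSTM": 0.075, "ConvLSTM": 0.069},
}
pvtv2_cbam_mae = {"Yellow_tripdata": 0.060, "T-Driver": 0.042}

for dataset, baselines in baseline_mae.items():
    for model, value in baselines.items():
        drop = relative_reduction(value, pvtv2_cbam_mae[dataset])
        print(f"{dataset:16s} vs {model:9s}: {drop:5.1f}% lower MAE")
```

Rounded to whole percentages, the reductions are about 0%, 50%, and 62% on Yellow_tripdata and about 41%, 44%, and 39% on T-Driver, which matches the stated ranges of 0 to 62% and 39% to 44%.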
| PVTv2 (baseline) | CBAM | Differential privacy | Yellow_tripdata MAE | Yellow_tripdata RMSE | Yellow_tripdata H(P,Q) | T-Driver MAE | T-Driver RMSE | T-Driver H(P,Q) |
|---|---|---|---|---|---|---|---|---|
| ✓ |  |  | 0.097±0.083 | 0.127±0.073 | 0.051±0.013 | 0.041±0.021 | 0.045±0.011 | 0.038±0.014 |
| ✓ | ✓ |  | 0.060±0.007 | 0.093±0.019 | 0.053±0.005 | 0.042±0.029 | 0.034±0.017 | 0.033±0.011 |
| ✓ |  | ✓ | 0.094±0.007 | 0.137±0.051 | 0.089±0.021 | 0.039±0.014 | 0.051±0.016 | 0.038±0.018 |
| ✓ | ✓ | ✓ | 0.054±0.015 | 0.089±0.030 | 0.058±0.012 | 0.042±0.003 | 0.037±0.021 | 0.034±0.011 |
Tab. 4 Results of ablation experiments
| Privacy budget | Yellow_tripdata MAE | Yellow_tripdata RMSE | Yellow_tripdata H(P,Q) | T-Driver MAE | T-Driver RMSE | T-Driver H(P,Q) |
|---|---|---|---|---|---|---|
| 0.5 | 0.031 | 0.045 | 0.022 | 0.027 | 0.041 | 0.038 |
| 1.0 | 0.024 | 0.046 | 0.021 | 0.026 | 0.037 | 0.039 |
| 2.0 | 0.017 | 0.040 | 0.018 | 0.024 | 0.039 | 0.036 |
| 4.0 | 0.016 | 0.038 | 0.019 | 0.025 | 0.042 | 0.034 |
Tab. 5 Prediction metrics with different privacy budgets
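A larger privacy budget generally means less injected noise, which tends to leave prediction quality less degraded. The sketch below illustrates that trade-off with the standard Laplace mechanism, whose noise scale is the sensitivity divided by the budget. This is an illustrative substitute, not the MMA mechanism used in the paper, and the unit sensitivity and function names are assumptions.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def sample_laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


if __name__ == "__main__":
    rng = random.Random(0)
    # Budgets taken from the first column of Tab. 5; sensitivity 1.0 is assumed.
    for epsilon in (0.5, 1.0, 2.0, 4.0):
        scale = laplace_scale(1.0, epsilon)
        noise = sample_laplace(scale, rng)
        print(f"epsilon={epsilon:3.1f}  scale={scale:4.2f}  sample noise={noise:+.3f}")
```

With unit sensitivity, the scale falls from 2.0 at a budget of 0.5 to 0.25 at a budget of 4.0.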