结合差分隐私与安全聚集的联邦空间数据发布方法

doi:10.11772/j.issn.1001-9081.2023091296

《计算机应用》唯一官方网站 ›› 2024, Vol. 44 ›› Issue (9): 2777-2784.DOI: 10.11772/j.issn.1001-9081.2023091296

结合差分隐私与安全聚集的联邦空间数据发布方法

张治政¹^,², 张啸剑¹^,², 王俊清¹^,²(), 冯光辉³

^1.河南财经政法大学计算机与信息工程学院, 郑州 450046
^2.郑州市人工智能安全与隐私保护重点实验室(河南财经政法大学), 郑州 450046
^3.广州大学计算机科学与网络工程学院, 广州 510006

收稿日期:2023-09-20 修回日期:2023-12-26 接受日期:2023-12-29 发布日期:2024-02-22 出版日期:2024-09-10
通讯作者: 王俊清
作者简介:张治政（1997—），男，河南信阳人，硕士研究生，主要研究方向：差分隐私、联邦分析
张啸剑（1980—），男，河南郑州人，教授，博士，CCF会员，主要研究方向：差分隐私、机器学习、图数据管理
王俊清（1999—），男，河南驻马店人，硕士研究生，主要研究方向：隐私保护、差分隐私
冯光辉（1982—），男，河南驻马店人，博士研究生，主要研究方向：隐私保护、差分隐私。
基金资助:
国家自然科学基金资助项目(62072156)

Federated spatial data publication method with differential privacy and secure aggregation

Zhizheng ZHANG¹^,², Xiaojian ZHANG¹^,², Junqing WANG¹^,²(), Guanghui FENG³

^1.School of Computer and Information Engineering，Henan University of Economics and Law，Zhengzhou Henan 450046，China
^2.Zhengzhou Key Laboratory of Artificial Intelligence Security and Privacy Protection （Henan University of Economics and Law），Zhengzhou Henan 450046，China
^3.School of Computer Science and Cyber Engineering，Guangzhou University，Guangzhou Guangdong 510006，China

Received:2023-09-20 Revised:2023-12-26 Accepted:2023-12-29 Online:2024-02-22 Published:2024-09-10
Contact: Junqing WANG
About author:ZHANG Zhizheng， born in 1997， M. S. candidate. His research interests include differential privacy， federated analysis.
ZHANG Xiaojian， born in 1980， Ph. D.， professor. His research interests include differential privacy， machine learning， graph data management.
FENG Guanghui， born in 1982， Ph. D. candidate. His research interests include privacy protection， differential privacy.
Supported by:
National Natural Science Foundation of China(62072156)

摘要/Abstract

摘要：

针对联邦空间数据的数据孤岛问题、空间数据索引问题以及发布联邦空间数据存在的隐私问题，提出基于动态四分树的联邦空间数据发布（FSP）方法。首先，在FSP方法的每轮迭代中，服务端把四分树副本共享给该轮中每个客户端，每个客户端利用四分树副本编码自身位置数据，利用Polya分布产生离散噪声在本地扰动编码结果；其次，结合容错学习（LWE）生成本地掩码对噪声结果进行加密；再次，安全聚集端结合该轮迭代中每个客户端的报告值，执行安全聚集与消除掩码操作，然后把聚集结果发送给服务端；最后，服务端结合收集的编码向量与噪声方差自底向上地动态修剪四分树结构。在Beijing、Checkin、NYC和Landmark 4个空间数据集上的实验结果表明，FSP方法在保证客户端隐私的同时，与已有的较好的联邦空间数据发布方法AHH（Adaptive Hierarchical Histograms）相比，在隐私预算为1.8时，FSP的均方误差（MSE）分别降低了3.80％、2.96％、7.51％和14.13％。可见使用FSP方法进行联邦空间数据发布的精度优于同类方法。

关键词: 联邦分析, 分布式差分隐私, 安全聚集, 空间数据发布, 四分树

Abstract:

Aiming at the problems of federated spatial data isolation， spatial data indexing， and privacy of publishing spatial data， a Federated Spatial data Publishing （FSP） method based on dynamic quad-tree was proposed. Firstly， in each iteration of the FSP method， quad-tree replica was shared by the server with each client in the round， and each client encoded its own location data using the quad-tree replica， and discrete noise was generated through Polya distribution for locally perturbing the encoding results. Secondly， local masks were generated through LWE （Learning With Error） to encrypt the noisy results. Thirdly， the reported values from each client in the iteration were combined by the aggregator to perform secure aggregation and mask elimination. Then the aggregated results were sent to the server. The quad-tree structure was pruned by the server dynamically in a bottom-up way based on the collected encoding vectors and noise variance. Experimental results on four spatial datasets Beijing， Checkin， NYC， and Landmark show that the FSP method not only ensures client privacy， but also reduces the Mean Squared Error （MSE） in federated spatial data publication by 3.80%， 2.96%， 7.51% and 14.13% at a privacy budget of 1.8， respectively， compared to the existing better federated spatial data publication method AHH （Adaptive Hierarchical Histograms）. This indicates that the FSP method achieves higher precision than similar methods in federated spatial data publishing.

Key words: federated analytics, distributed differential privacy, secure aggregation, spatial data publication, quad-tree

中图分类号:

TP309

张治政, 张啸剑, 王俊清, 冯光辉. 结合差分隐私与安全聚集的联邦空间数据发布方法[J]. 计算机应用, 2024, 44(9): 2777-2784.

Zhizheng ZHANG, Xiaojian ZHANG, Junqing WANG, Guanghui FENG. Federated spatial data publication method with differential privacy and secure aggregation[J]. Journal of Computer Applications, 2024, 44(9): 2777-2784.

图/表 8

参考文献 23

1	McMAHAN B， MOORE E， RAMAGE D， et al. Communication-efficient learning of deep networks from decentralized data ［C］// Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. New York： PMLR， 2017， 54： 1273-1282.
2	RAMAGE D， MAZZOCCHI S. Federated analytics： collaborative data science without data collection ［EB/OL］. ［2022-11-11］. .
3	BAGDASARYAN E， KAIROUZ P， MELLEM S， et al. Towards sparse federated analytics： location heatmaps under distributed differential privacy with secure aggregation ［J］. Proceedings on Privacy Enhancing Technologies， 2022（4）： 162-182.
4	CORMODE G， PROCOPIUC C， SRIVASTAVA D， et al. Differentially private spatial decompositions ［C］// Proceedings of the 2012 IEEE 28th International Conference on Data Engineering. Piscataway： IEEE， 2012： 20-31.
5	张治政. 结合差分隐私与安全聚集的联邦空间数据分析关键技术研究［D］.郑州：河南财经政法大学，2023： 36-43.
	ZHANG Z Z. Research on federated spatial data analysis combing differential privacy and secure aggregation ［D］. Zhengzhou： Henan University of Economics and Law， 2023： 36-43.
6	REGEV O. The learning with errors problem （invited survey）［C］// Proceedings of the 2010 IEEE 25th Annual Conference on Computational Complexity. Piscataway： IEEE， 2010： 191-204.
7	QARDAJI W， YANG W， LI N H. Differentially private grids for geospatial data ［C］// Proceedings of the 2013 IEEE 29th International Conference on Data Engineering. Piscataway： IEEE， 2013： 757-768.
8	张啸剑，付楠，孟小峰. 基于本地差分隐私的空间范围查询方法［J］. 计算机研究与发展， 2020， 57（4）： 847-858.
	ZHANG X J， FU N， MENG X F. Towards spatial range queries under local differential privacy ［J］. Journal of Computer Research and Development， 2020， 57（4）： 847-858.
9	WANG T， BLOCKI J， LI N， et al. Locally differentially private protocols for frequency estimation ［C］// Proceedings of the 26th USENIX Security Symposium. Berkeley： USENIX Association， 2017： 729-745.
10	GHAZI B， HE J， KOHLHOFF K， et al. Differentially private heatmaps ［C］// Proceedings of the 2023 AAAI Conference on Artificial Intelligence. Palo Alto： AAAI Press， 2023， 37（6）： 7696-7704.
11	ZHU W， KAIROUZ P， McMAHAN B， et al. Federated heavy hitters discovery with differential privacy ［C］// Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics. New York： PMLR， 2020， 108： 3837-3847.
12	LI M， ZENG Y， CHEN L. Efficient and accurate range counting on privacy-preserving spatial data federation ［C］// Proceedings of the 28th International Conference on Database Systems for Advanced Applications. Cham： Springer， 2023： 317-333.
13	BONAWITZ K， IVANOV V， KREUTER B， et al. Practical secure aggregation for privacy-preserving machine learning ［C］// Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. New York： ACM， 2017： 1175-1191.
14	KAIROUZ P， LIU Z， STEINKE T. The distributed discrete gaussian mechanism for federated learning with secure aggregation［C］// Proceedings of the 38th International Conference on Machine Learning. New York： PMLR， 2021， 139： 5201-5212.
15	AGARWAL N， KAIROUZ P， LIU Z. The skellam mechanism for differentially private federated learning ［C］// Proceedings of the 2021 Advances in Neural Information Processing Systems. Red Hook： Curran Associates Inc.， 2021， 34： 5052-5064.
16	CHEN W-N， CHOQUETTE-CHOO C A， KAIROUZ P， et al. The fundamental price of secure aggregation in differentially private federated learning［C］// Proceedings of the 39th International Conference on Machine Learning. New York： PMLR， 2022， 162： 3056-3089.
17	CHEN W-N， OZGUR C A， KAIROUZ P. The Poisson binomial mechanism for unbiased federated learning with secure aggregation［C］// Proceedings of the 2022 International Conference on Machine Learning. New York： PMLR， 2022， 162： 3490-3506.
18	FRANKLIN M， YUNG M. Communication complexity of secure computation ［C］// Proceedings of the 24th Annual ACM Symposium on Theory of Computing. New York： ACM， 1992： 699-710.
19	STEVENS T， SKALKA C， VINCENT C， et al. Efficient differentially private secure aggregation for federated learning via hardness of learning with errors ［C］// Proceedings of the 31st USENIX Security Symposium. Berkeley： USENIX Association， 2022： 1379-1395.
20	GORYCZKA S， XIONG L. A comprehensive comparison of multiparty secure additions with differential privacy ［J］. IEEE Transactions on Dependable and Secure Computing， 2017， 14（5）： 463-477.
21	ZHANG J， XIAO X， XIE X. PrivTree： a differentially private algorithm for hierarchical decompositions ［C］// Proceedings of the 2016 International Conference on Management of Data. New York： ACM， 2016： 155-170.
22	CHOWDHURY S R， ZHOU X. Distributed differential privacy in multi-armed bandits ［EB/OL］. ［2023-03-17］. .
23	KAIROUZ P， McMAHAN H B， AVENT B， et al. Advances and open problems in federated learning ［J］. Foundations and Trends in Machine Learning， 2021， 14（1/2）： 1-210.

数据集	坐标范围	实例大小	采样大小
Beijing	［39.6°N，40.2°N］×［116.1°E，116.7°E］	15 000 000	1 000 000
Checkin	［48.2°S，90.0°N］×［176.3°W，177.46°E］	1 000 000	500 000
NYC	［40.6°N，40.9°N］× ［74.05°W，73.75°W］	10 000 000	1 000 000
Landmark	［24.6°N，49.0°N］× ［124.4°W，67.0°W］	870 051	500 000

数据集	坐标范围	实例大小	采样大小
Beijing	［39.6°N，40.2°N］×［116.1°E，116.7°E］	15 000 000	1 000 000
Checkin	［48.2°S，90.0°N］×［176.3°W，177.46°E］	1 000 000	500 000
NYC	［40.6°N，40.9°N］× ［74.05°W，73.75°W］	10 000 000	1 000 000
Landmark	［24.6°N，49.0°N］× ［124.4°W，67.0°W］	870 051	500 000

方法	Beijing数据集上的MSE/10^-11			Checkin数据集上的MSE/10^-11			NYC数据集上的MSE/10^-11			Landmark数据集上的MSE/10^-11
方法	ε=1.8	ε=3.0	ε=6.0	ε=1.8	ε=3.0	ε=6.0	ε=1.8	ε=3.0	ε=6.0	ε=1.8	ε=3.0	ε=6.0
Flat	1 710.75	1 354.33	1 243.90	2 282.45	1 328.98	1 226.04	1 693.04	1 330.68	1 221.21	1 691.63	1 331.65	1 226.18
HOHE	1 187.50	828.52	746.87	2 071.88	964.84	846.87	1 425.00	1 025.00	746.87	1 275.00	828.52	796.09
Quad-geo	855.34	796.48	756.81	841.64	797.83	746.87	842.66	789.61	748.11	820.20	798.38	748.41
AHH	7.89	7.76	7.66	541.00	541.00	525.00	8.52	8.27	7.90	26.90	26.10	24.40
FSP	7.59	7.53	7.43	525.00	525.00	524.00	7.88	7.72	7.65	23.10	22.60	22.20

方法	Beijing数据集上的MSE/10^-11			Checkin数据集上的MSE/10^-11			NYC数据集上的MSE/10^-11			Landmark数据集上的MSE/10^-11
方法	ε=1.8	ε=3.0	ε=6.0	ε=1.8	ε=3.0	ε=6.0	ε=1.8	ε=3.0	ε=6.0	ε=1.8	ε=3.0	ε=6.0
Flat	1 710.75	1 354.33	1 243.90	2 282.45	1 328.98	1 226.04	1 693.04	1 330.68	1 221.21	1 691.63	1 331.65	1 226.18
HOHE	1 187.50	828.52	746.87	2 071.88	964.84	846.87	1 425.00	1 025.00	746.87	1 275.00	828.52	796.09
Quad-geo	855.34	796.48	756.81	841.64	797.83	746.87	842.66	789.61	748.11	820.20	798.38	748.41
AHH	7.89	7.76	7.66	541.00	541.00	525.00	8.52	8.27	7.90	26.90	26.10	24.40
FSP	7.59	7.53	7.43	525.00	525.00	524.00	7.88	7.72	7.65	23.10	22.60	22.20

[1]	高瑞, 陈学斌, 张祖篡. 面向部分图更新的动态社交网络隐私发布方法[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3831-3838.
[2]	项勇, 李艳俊, 黄丁韫, 陈愚, 谢惠琴. 全轮Shadow算法的差分和线性特征分析[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3839-3843.
[3]	赵振皓, 张仕斌, 万武南, 张金全, 秦智. 基于信誉值和强盲签名算法的委托权益证明共识算法[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3717-3722.
[4]	王伊婷, 万武南, 张仕斌, 张金全, 秦智. 基于SM9算法的可链接环签名方案[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3709-3716.
[5]	梁静, 万武南, 张仕斌, 张金全, 秦智. 面向主从链的慈善系统溯源存储模型[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3751-3758.
[6]	刘德渊, 张金全, 张鑫, 万武南, 张仕斌, 秦智. 基于无证书签密的跨链身份认证方案[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3731-3740.
[7]	张鑫, 张金全, 刘德渊, 万武南, 张仕斌, 秦智. 基于身份代理重加密的跨链身份管理方案[J]. 《计算机应用》唯一官方网站, 2024, 44(12): 3723-3730.
[8]	邓伊琳余发江. 基于LSTM和可分离自注意力机制的伪随机数生成器[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[9]	张润莲王蒿唐瑞锋武小年. 基于均匀流型逼近与投影的高级加密标准算法相关功耗分析方法[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[10]	刘运东汪学明. 基于穿刺伪随机函数的动态可搜索加密方案[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[11]	张宏扬张淑芬谷铮. 面向个性化与公平性的联邦学习算法fedPF[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[12]	姚梓豪马自强李扬魏良根. 基于冲突的缓存侧信道攻击与驱逐集研究综述[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[13]	晏燕李飞飞吕雅琴冯涛. 安全高效的混洗差分隐私频率估计方法[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[14]	彭海洋计卫星刘法旺. 基于区块链的自动驾驶仿真测试数据存证模型[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.
[15]	闫润雨郭瑞闫永勃刘光军. 云中指定测试者的细粒度结果可验证搜索加密方案[J]. 《计算机应用》唯一官方网站, 0, (): 0-0.

结合差分隐私与安全聚集的联邦空间数据发布方法

Federated spatial data publication method with differential privacy and secure aggregation

RichHTML

PDF

可视化

摘要/Abstract

引用本文

使用本文

图/表 8

参考文献 23

相关文章 15

编辑推荐

Metrics