Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (8): 2600-2611.DOI: 10.11772/j.issn.1001-9081.2024070911

• Cyber security • Previous Articles    

Secure and efficient frequency estimation method based on shuffled differential privacy

Yan YAN, Feifei LI(), Yaqin LYU, Tao FENG   

  1. School of Computer and Communication,Lanzhou University of Technology,Lanzhou Gansu 730050,China
  • Received:2024-06-30 Revised:2024-10-14 Accepted:2024-10-16 Online:2024-11-19 Published:2025-08-10
  • Contact: Feifei LI
  • About author:YAN Yan, born in 1980, Ph. D., professor. Her research interests include privacy protection, information security.
    LYU Yaqin, born in 2000, M. S. candidate. Her research interests include location privacy protection.
    FENG Tao, born in 1970, Ph. D., research fellow. His research interests include modern cryptography theory, information security.
  • Supported by:
    National Natural Science Foundation of China(62361036);Natural Science Foundation of Gansu Province(22JR5RA279)

安全高效的混洗差分隐私频率估计方法

晏燕, 李飞飞(), 吕雅琴, 冯涛   

  1. 兰州理工大学 计算机与通信学院,兰州 730050
  • 通讯作者: 李飞飞
  • 作者简介:晏燕(1980—),女,甘肃兰州人,教授,博士,CCF高级会员,主要研究方向:隐私保护、信息安全
    吕雅琴(2000—),女,山西晋中人,硕士研究生,主要研究方向:位置隐私保护
    冯涛(1970—),男,甘肃定西人,研究员,博士,主要研究方向:现代密码学理论、信息安全。
  • 基金资助:
    国家自然科学基金资助项目(62361036);甘肃省自然科学基金资助项目(22JR5RA279)

Abstract:

Shuffled Differential Privacy (SDP) models can balance the degree of privacy protection at user side and the usability of published results at server side. Therefore, they are more suitable for privacy-preserving big data collection and statistical publishing scenarios. Aiming at the problems of low shuffling efficiency and insufficient shuffling process security of the existing SDP frequency estimation methods, the following work was performed: firstly, an SDP Blind Signature Algorithm (SDPBSA) was designed on the basis of optimized elliptic curve to achieve discrimination of tampered or forged information, thereby improving the security of shuffling process. Then, a Matrix Column Rearrangement Transposition (MCRT) shuffling method was proposed to realize data shuffling by random matrix column rearrangement and matrix transposition operations, thereby improving the efficiency of shuffling process. Finally, above methods were combined to construct a complete SDP frequency estimation privacy protection framework — SM-SDP (SDP based on blind Signature and Matrix column rearrangement transposition), and its privacy and error level were analyzed theoretically. Experimental results on datasets such as Normal, Zipf, and IPUMS (Integrated Public Use Microdata Series) demonstrate that the MCRT shuffling method improves the shuffling efficiency by about 1 to 2 orders of magnitude compared to shuffling methods such as Fisher-Yates, ORShuffle (Oblivious Recursive Shuffling), and MRS (Message Random Shuffling); SM-SDP framework reduces the Mean Squared Error (MSE) by 2 to 11 orders of magnitude in the presence of different proportions of malicious data compared to frequency estimation methods such as mixDUMP, PSDP (Personalized Differential Privacy in Shuffle model), and HP-SDP (Histogram Publication with SDP).

Key words: privacy protection, frequency estimation, Shuffled Differential Privacy (SDP), blind signature, matrix operation

摘要:

混洗差分隐私(SDP)模型能兼顾用户端的隐私保护程度和服务器端发布结果的可用性,更适用于隐私保护的大数据收集和统计发布场景。针对目前SDP频率估计方法的洗牌效率较低和混洗过程安全性不足等问题,进行以下工作:首先,设计基于优化椭圆曲线的混洗差分隐私盲签名算法(SDPBSA),以实现对篡改或伪造信息的鉴别,提高混洗过程的安全性;其次,提出矩阵列重排转置(MCRT)洗牌方法,以利用随机的矩阵列重排和矩阵转置操作实现数据混洗,提高混洗过程的效率;最后,结合上述方法构建完整的SDP频率估计隐私保护框架——SM-SDP (SDP based on blind Signature and Matrix column rearrangement transposition),并通过理论分析讨论它的隐私性和误差级别。在Normal、Zipf和IPUMS (Integrated Public Use Microdata Series)等数据集上的实验结果表明,相较于Fisher-Yates、ORShuffle (Oblivious Recursive Shuffling)和MRS (Message Random Shuffling)等洗牌方法, MCRT洗牌方法的洗牌效率提升了1~2个数量级;相较于mixDUMP、PSDP (Personalized Differential Privacy in Shuffle model)和HP-SDP (Histogram Publication with SDP)等频率估计方法, SM-SDP框架在不同比例恶意数据存在时的均方误差(MSE)降低了2~11个数量级。

关键词: 隐私保护, 频率估计, 混洗差分隐私, 盲签名, 矩阵运算

CLC Number: