Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (11): 3242-3248.DOI: 10.11772/j.issn.1001-9081.2020030379

• Cyber security

High-precision histogram publishing method based on differential privacy

LI Kunming1, WANG Chaoqian2, NI Weiwei2, BAO Xiaohan2   

  1. Smart Grid Service Center, Jiangsu Frontier Electric Technology Company Limited, Nanjing Jiangsu 210000, China;
    2. College of Computer Science and Engineering, Southeast University, Nanjing Jiangsu 211189, China
  • Received:2020-03-30 Revised:2020-05-29 Online:2020-11-10 Published:2020-06-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61772131).

基于差分隐私的高精度直方图发布方法

李昆明1, 王超迁2, 倪巍伟2, 鲍晓涵2   

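  • Corresponding author: NI Weiwei (born 1979), male, from Huai'an, Jiangsu; professor, doctoral supervisor, Ph.D., CCF member. Research interests: data privacy protection, data mining, complex data management. wni@seu.edu.cn
  • About the authors: LI Kunming (born 1984), male, from Anda, Heilongjiang; engineer. Research interests: big data, data privacy protection. WANG Chaoqian (born 1997), male, from Xingtai, Hebei; master's student. Research interests: data privacy protection. BAO Xiaohan (born 1996), female, from Xuancheng, Anhui; master's student. Research interests: data privacy protection.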

Abstract: Existing privacy-preserving histogram publishing methods that group bins to suppress differential-privacy noise cannot effectively balance the group approximation error against the Differential Privacy (DP) Laplace error, which reduces the utility of the published histogram. To address this problem, a High-Precision Histogram Publishing method (HPHP) was proposed. First, constraint inference was used to sort the histogram while satisfying the DP constraints. Then, on the ordered histogram, a dynamic programming grouping method was used to generate the grouping with the smallest total error on the noise-added histogram. Finally, Laplace noise was added to each group mean. To support comparative analysis, a privacy-preserving histogram publishing method with the theoretical minimum error (Optimal) was also proposed. Experiments comparing HPHP with the DP method that adds noise directly, the AHP (Accurate Histogram Publication) method, and Optimal show that the Kullback-Leibler Divergence (KLD) of the histogram published by HPHP is reduced by 90% compared with that of AHP and is close to that of Optimal. In conclusion, under the same preconditions, HPHP publishes higher-precision histograms while guaranteeing DP.
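
The three steps named in the abstract (sort, group by dynamic programming, add Laplace noise to group means) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `group_cost`, `dp_partition` and `publish`, the budget split into `eps1` and `eps2`, and the per-group cost formula are all assumptions made for illustration. The constraint-inference step is simplified here to sorting noisy counts, which is post-processing of a DP release.

```python
import numpy as np

def group_cost(prefix, prefix_sq, i, j, eps2):
    """Assumed error estimate for merging sorted bins i..j-1 into one group."""
    m = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    approx = max(sq - s * s / m, 0.0)   # squared deviation from the group mean
    noise = 2.0 / (m * eps2 ** 2)       # m bins, each with Var[Lap(1/(m*eps2))]
    return approx + noise

def dp_partition(values, eps2):
    """O(n^2) dynamic program over contiguous groups of a sorted sequence."""
    n = len(values)
    prefix = np.concatenate(([0.0], np.cumsum(values)))
    prefix_sq = np.concatenate(([0.0], np.cumsum(np.square(values))))
    best = [0.0] + [float("inf")] * n   # best[j]: minimal cost of first j bins
    cut = [0] * (n + 1)                 # cut[j]: start index of the last group
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + group_cost(prefix, prefix_sq, i, j, eps2)
            if c < best[j]:
                best[j], cut[j] = c, i
    groups, j = [], n
    while j > 0:                        # walk the cuts back to recover groups
        groups.append((cut[j], j))
        j = cut[j]
    return groups[::-1]

def publish(counts, eps1, eps2, rng):
    """Hypothetical pipeline: noisy sort (eps1), grouping, noisy means (eps2)."""
    counts = np.asarray(counts, dtype=float)
    # Step 1: order bins using noisy counts only (count sensitivity is 1).
    noisy = counts + rng.laplace(scale=1.0 / eps1, size=counts.size)
    order = np.argsort(noisy, kind="stable")
    # Step 2: group on the noise-added, sorted histogram (post-processing).
    groups = dp_partition(noisy[order], eps2)
    # Step 3: a group mean over m bins has sensitivity 1/m.
    out = np.empty_like(counts)
    for i, j in groups:
        idx = order[i:j]
        out[idx] = counts[idx].mean() + rng.laplace(scale=1.0 / (len(idx) * eps2))
    return out
```

A call such as `publish([3, 5, 4, 120, 118], eps1=0.5, eps2=0.5, rng=np.random.default_rng(0))` returns one published value per bin. Under these assumptions the total budget is `eps1 + eps2`, because the groups are disjoint and the grouping itself touches only already-noised counts.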

Key words: histogram, Differential Privacy (DP), constraint inference, global grouping, dynamic programming
