Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (10): 2938-2945. DOI: 10.11772/j.issn.1001-9081.2017.10.2938

• Data Science and Technology •

Trajectory pattern mining algorithm based on differential privacy

JIN Kaizhong, PENG Huili, ZHANG Xiaojian

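  1. School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450002
  • Received: 2017-05-05 Revised: 2017-07-28 Online: 2017-10-10 Published: 2017-10-16
  • Corresponding author: ZHANG Xiaojian (born 1980), male, from Zhoukou, Henan; associate professor, Ph.D., CCF member. Research interests: differential privacy, databases. E-mail: xjzhang82@ruc.edu.cn
  • About the authors: JIN Kaizhong (born 1991), male, from Kaifeng, Henan; M.S. candidate. Research interests: differential privacy, databases. PENG Huili (born 1981), female, from Zhoukou, Henan; lecturer, M.S. Research interests: databases, privacy protection. ZHANG Xiaojian (born 1980), male, from Zhoukou, Henan; associate professor, Ph.D., CCF member. Research interests: differential privacy, databases.
  • Supported by:
    National Natural Science Foundation of China (61502146, 91646203); Natural Science Foundation of Henan Province (162300410006); Science and Technology Research Project of Henan Province (162102310411); Key Scientific Research Project of Higher Education Institutions, Education Department of Henan Province (16A520002); Youth Top-notch Talent Project of Henan University of Economics and Law.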

Trajectory pattern mining with differential privacy

JIN Kaizhong, PENG Huili, ZHANG Xiaojian   

  1. School of Computer & Information Engineering, Henan University of Economics and Law, Zhengzhou, Henan 450002, China
  • Received: 2017-05-05 Revised: 2017-07-28 Online: 2017-10-10 Published: 2017-10-16
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61502146, 91646203), the Natural Science Foundation of Henan Province (162300410006), the Science and Technology Project of Henan Province (162102310411), the Key Scientific Research Projects of Education Department of Henan Province (16A520002), and the Youth Top-notch Talent Project of Henan University of Economics and Law.

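Abstract: Existing differentially private algorithms for mining frequent trajectory patterns suffer from excessively high global sensitivity and low utility of the mining results. To address these problems, LTPM, a differentially private frequent trajectory pattern mining algorithm based on a prefix sequence lattice and trajectory truncation, is proposed. The algorithm first obtains the optimal truncation length with an adaptive method and then truncates the original database with a dynamic programming strategy. On this basis, it builds a prefix sequence lattice using an equivalence relation and mines frequent trajectory patterns. Theoretical analysis shows that LTPM satisfies ε-differential privacy. Experimental results show that LTPM clearly outperforms the N-gram and Prefix algorithms in True Positive Rate (TPR) and Average Relative Error (ARE), and effectively improves the utility of the mining results.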

Keywords: differential privacy, privacy protection, frequent pattern mining, trajectory truncation, prefix sequence lattice

Abstract: To address the high global sensitivity and low utility of mining results in existing differentially private frequent trajectory pattern mining algorithms, a Lattice-Trajectory Pattern Mining (LTPM) algorithm based on a prefix sequence lattice and trajectory truncation was proposed for mining frequent trajectory patterns with differential privacy. An adaptive method was employed to obtain the optimal truncation length, and a dynamic programming strategy was used to truncate the original database. On the truncated database, an equivalence relation was used to construct the prefix sequence lattice for mining trajectory patterns. Theoretical analysis shows that LTPM satisfies ε-differential privacy. Experimental results show that LTPM achieves a better True Positive Rate (TPR) and Average Relative Error (ARE) than the N-gram and Prefix algorithms, which verifies that LTPM can effectively improve the utility of the mining results.

Key words: differential privacy, privacy protection, frequent pattern mining, trajectory truncation, prefix sequence lattice
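
The abstract attributes the low utility of earlier algorithms to high global sensitivity and uses trajectory truncation to bound it. The Python sketch below illustrates only that general idea and is not the LTPM algorithm. It assumes a naive prefix truncation (not the paper's dynamic programming strategy), a fixed truncation length `l_max` (not the adaptively chosen one), contiguous length-`k` patterns, a public location alphabet, and the standard Laplace mechanism. All function names are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def truncate(trajectory, l_max):
    """Naive prefix truncation (an assumption): keep the first l_max locations."""
    return trajectory[:l_max]


def supports(db, k):
    """For each contiguous length-k pattern, count the trajectories containing it."""
    counts = Counter()
    for traj in db:
        # A set ensures each trajectory adds at most 1 to any pattern's count.
        grams = {tuple(traj[i:i + k]) for i in range(len(traj) - k + 1)}
        counts.update(grams)
    return counts


def l1_sensitivity(l_max, k):
    """A trajectory of length <= l_max holds at most l_max - k + 1 length-k patterns."""
    return max(l_max - k + 1, 0)


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_supports(db, alphabet, k, l_max, epsilon, rng):
    """Laplace mechanism over every candidate pattern built from a public alphabet.

    Noise is added to all candidates, including those with zero support, so the
    set of released patterns does not depend on the private data.
    """
    if l_max < k:
        raise ValueError("l_max must be at least k")
    exact = supports([truncate(t, l_max) for t in db], k)
    scale = l1_sensitivity(l_max, k) / epsilon
    return {p: exact[p] + laplace(scale, rng) for p in product(alphabet, repeat=k)}


if __name__ == "__main__":
    db = [["a", "b", "c", "d"], ["a", "b"], ["b", "c", "a", "b"]]
    released = noisy_supports(db, ["a", "b", "c", "d"], k=2, l_max=3,
                              epsilon=1.0, rng=random.Random(0))
    print(released[("a", "b")])
```

With `l_max = 3` and `k = 2`, one trajectory can change at most 2 counts, so the noise scale is 2/ε. Without truncation, the longest trajectory in this toy database (length 4) would require a scale of 3/ε, and the gap widens as trajectories get longer.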

CLC number: