Journal of Computer Applications, 2013, Vol. 33, Issue (02): 412-416. DOI: 10.3724/SP.J.1087.2013.00412

• Information security

Clustering-based approach for multi-level anonymization

GUI Qiong,CHENG Xiaohui   

  1. College of Information Science and Engineering, Guilin University of Technology, Guilin Guangxi 541004, China
  • Received: 2012-08-21 Revised: 2012-10-06 Online: 2013-02-01 Published: 2013-02-25
  • Contact: GUI Qiong

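  • About the authors: GUI Qiong (b. 1972), female, born in Guilin, Guangxi, associate professor, M.S.; research interests: information security, data mining;
    CHENG Xiaohui (b. 1961), male, born in Zhangshu, Jiangxi, professor, Ph.D.; research interests: information security, Internet of Things, embedded systems.
  • Supported by:
    National Natural Science Foundation of China; Major Scientific Research Project of Guangxi Higher Education Institutions; Scientific Research Project of the Guangxi Department of Education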

Abstract: To prevent privacy disclosure caused by linking attacks while keeping the information loss incurred by anonymization as low as possible, a (λα, k) multi-level anonymity model was proposed. According to how strongly each value needs privacy protection, the sensitive attribute values were divided into three levels (high, medium, and low), and the risk of privacy disclosure was flexibly controlled by the privacy protection degree parameter λ. On this basis, a clustering-based approach for multi-level anonymization was presented. The approach used a new hierarchical clustering algorithm and applied flexible generalization strategies to the numerical and categorical attributes in the quasi-identifier. The experimental results show that the approach meets the multi-level anonymity requirements for sensitive attributes and effectively reduces information loss.

Key words: privacy preservation, data publishing, data anonymization, multi-level, clustering, information loss
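
The abstract does not give the formal definition of the (λα, k) model, so the following Python sketch is only one plausible reading, not the authors' definition. It assumes that every equivalence class (group of records sharing the same generalized quasi-identifier) must hold at least k records, and that the share of any sensitive value within a class must stay below a cap that depends on the value's level. The names `MAX_SHARE`, `is_group_safe`, and `is_table_safe`, and the cap values themselves, are hypothetical; how λ and α actually combine into such caps is not stated in the abstract.

```python
from collections import Counter

# Hypothetical per-level frequency caps (assumption, not from the paper):
# the more sensitive the level, the smaller the share a single value may
# occupy inside one equivalence class.
MAX_SHARE = {"high": 0.2, "medium": 0.5, "low": 1.0}


def is_group_safe(sensitive_values, level_of, k, max_share=MAX_SHARE):
    """Check one equivalence class against the assumed constraints.

    sensitive_values: list of sensitive attribute values in the class
    level_of: dict mapping each sensitive value to "high"/"medium"/"low"
    k: minimum class size (the usual k-anonymity requirement)
    """
    n = len(sensitive_values)
    if n < k:
        return False
    counts = Counter(sensitive_values)
    # Every value's relative frequency must respect the cap of its level.
    return all(
        count / n <= max_share[level_of[value]]
        for value, count in counts.items()
    )


def is_table_safe(groups, level_of, k, max_share=MAX_SHARE):
    """A table passes only if every equivalence class passes."""
    return all(is_group_safe(g, level_of, k, max_share) for g in groups)


if __name__ == "__main__":
    levels = {"HIV": "high", "asthma": "medium", "flu": "low"}
    good = ["HIV", "flu", "flu", "flu", "asthma", "asthma"]
    bad = ["HIV", "HIV", "flu", "flu"]
    print(is_group_safe(good, levels, k=4))
    print(is_group_safe(bad, levels, k=4))
```

Under these assumed caps, a class in which a high-level value such as "HIV" fills half of the records is rejected even though it satisfies the size requirement, which is the kind of level-dependent protection the abstract describes.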

