Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (2): 392-396. DOI: 10.11772/j.issn.1001-9081.2016.02.0392

• 3rd CCF Academic Conference on Big Data (CCF BigData 2015) •

Classification algorithm of support vector machine with privacy preservation based on information concentration

DI Lan1, YU Xiaotong1, LIANG Jiuzhen2

  1. School of Digital Media, Jiangnan University, Wuxi Jiangsu 214122, China;
  2. School of IoT Engineering, Jiangnan University, Wuxi Jiangsu 214122, China
  • Received: 2015-08-29 Revised: 2015-09-14 Published: 2016-02-10 Online: 2016-02-03
  • Corresponding author: YU Xiaotong (born 1989), male, from Qingdao, Shandong; M.S. candidate; research interests: digital image processing, data mining.
  • About the authors: DI Lan (born 1965), female, from Nanjing, Jiangsu; associate professor, M.S., CCF member; research interests: pattern recognition, digital image processing. LIANG Jiuzhen (born 1968), male, from Tai'an, Shandong; professor, Ph.D., CCF member; research interests: computer vision, pattern recognition.
  • Supported by:
    Six Talent Peaks Project of Jiangsu Province (DZXX-028); Industry-University-Research Project of Jiangsu Province (BY2014023-33).

Abstract: The classification decision process of a Support Vector Machine (SVM) involves learning from the original training samples, which can easily leak private information contained in the data. To solve this problem, a privacy-preserving classification method based on information concentration, called IC-SVM (Information Concentration Support Vector Machine), was proposed. Firstly, the training data were clustered with the Fuzzy C-Means (FCM) algorithm according to the neighborhood information of each sample. Then, the clustering centers were processed with an information concentration criterion to obtain concentrated points, which formed a new sample set. Finally, the new samples were used for training to obtain the decision function, which was then used for classification, protecting the privacy of the data fairly well. Experimental results on real UCI datasets and the PIE face dataset show that IC-SVM protects data privacy while achieving high classification accuracy.

Key words: Support Vector Machine (SVM), Fuzzy C-Means (FCM), classification, privacy preservation, information concentration
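
The three steps named in the abstract (FCM clustering, concentration of the cluster centers, SVM training on the resulting points) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the abstract does not define the information concentration criterion or how neighborhood information is used, so the sketch simply takes the per-class FCM centers as stand-in "concentrated points"; the SVM is a linear one fitted by subgradient descent; the data are synthetic; and all function names and parameters (`fcm_centers`, `train_linear_svm`, `c`, `lam`, `lr`) are hypothetical.

```python
import random

def fcm_centers(points, c, m=2.0, iters=50):
    """Plain fuzzy C-means; returns c cluster centers (memberships are discarded)."""
    dim = len(points[0])
    centers = [list(p) for p in points[:c]]  # deterministic init: first c samples
    for _ in range(iters):
        num = [[0.0] * dim for _ in range(c)]
        den = [0.0] * c
        for p in points:
            # Membership is proportional to d^(-2/(m-1)); the epsilon avoids division by zero.
            inv = [(sum((a - q) ** 2 for a, q in zip(p, ctr)) + 1e-12) ** (-1.0 / (m - 1.0))
                   for ctr in centers]
            total = sum(inv)
            for k in range(c):
                weight = (inv[k] / total) ** m
                den[k] += weight
                for j in range(dim):
                    num[k][j] += weight * p[j]
        centers = [[num[k][j] / den[k] for j in range(dim)] for k in range(c)]
    return centers

def train_linear_svm(xs, ys, lam=0.01, lr=0.1, steps=200):
    """Linear soft-margin SVM fitted by full-batch subgradient descent on the hinge loss."""
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        gw, gb = [lam * wj for wj in w], 0.0
        for x, y in zip(xs, ys):
            if y * (sum(wj * xj for wj, xj in zip(w, x)) + b) < 1.0:  # margin violated
                for j in range(dim):
                    gw[j] -= y * x[j] / n
                gb -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

rng = random.Random(0)

def blob(cx, cy, n=20):
    """Synthetic 2-D samples scattered uniformly around (cx, cy)."""
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)) for _ in range(n)]

raw = {+1: blob(3, 3), -1: blob(-3, -3)}  # toy stand-in for private training data

# Assumption: FCM centers replace the raw samples as the only training input.
centers, labels = [], []
for label, pts in raw.items():
    for ctr in fcm_centers(pts, c=3):
        centers.append(ctr)
        labels.append(label)

w, b = train_linear_svm(centers, labels)
print(predict(w, b, (3, 3)), predict(w, b, (-3, -3)))  # prints "1 -1"
```

In this sketch the classifier only ever sees the six surrogate points, which mirrors the idea of training on concentrated points instead of the original samples. Whether surrogates of this kind actually prevent disclosure depends on the concentration criterion, which the abstract does not specify.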

CLC number: