Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (3): 770-774. DOI: 10.11772/j.issn.1001-9081.2015.03.770

• Artificial Intelligence •

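Robust soft subspace clustering algorithm with feature weight self-adjustment mechanism

ZHI Xiaobin1, XU Zhaohui2

  1. School of Science, Xi'an University of Posts and Telecommunications, Xi'an Shaanxi 710121, China;
  2. School of Telecommunication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an Shaanxi 710121, China
  • Received: 2014-10-14 Revised: 2014-12-10 Online: 2015-03-10 Published: 2015-03-13
  • Corresponding author: ZHI Xiaobin
  • About the authors: ZHI Xiaobin (born 1976), male, from Bayannur, Inner Mongolia; associate professor, Ph.D., CCF member; main research interest: pattern recognition. XU Zhaohui (born 1988), male, from Yinchuan, Ningxia; M.S. candidate; main research interest: modern signal processing.
  • Supported by:

    National Natural Science Foundation of China (61340040, 61102095); Natural Science Foundation of Shaanxi Province (2014JM8307); Special Scientific Research Program of the Education Department of Shaanxi Province (14JK1661)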



Abstract:

The existing soft subspace clustering algorithm with a feature weight self-adjustment mechanism (SC-FWSA) is sensitive to noise. To address this, a robust soft subspace clustering algorithm with a feature weight self-adjustment mechanism (RSC-FWSA) is proposed, based on a non-Euclidean distance. During iteration, RSC-FWSA adaptively generates a weight function for the data and computes each cluster center as the weighted average of the data in that cluster. This weighted average makes the estimate of the cluster centers relatively insensitive to noise, which improves clustering accuracy on noisy data and on data with complex structure. Comparative experiments on synthetic and real data demonstrate the effectiveness of RSC-FWSA. In particular, results on a noisy synthetic data set and three real data sets (Wine, Zoo and Breastcancer) show that RSC-FWSA significantly improves clustering accuracy over its original counterpart, SC-FWSA. This robustness makes RSC-FWSA suitable for clustering high-dimensional data with noise and complex structure.

Key words: feature weighting, soft subspace clustering, self-adjustment mechanism, robust clustering, non-Euclidean distance
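
The abstract does not give the distance or the weight function, so the following is only an illustrative sketch of the general idea of a noise-insensitive weighted average, not the RSC-FWSA update itself. It assumes a Gaussian-style weight, exp(-beta * squared distance to the current center), which is one common choice for distance-based robust weighting. The names `robust_center`, `plain_mean` and the parameter `beta` are hypothetical.

```python
import math


def plain_mean(points):
    """Ordinary unweighted mean of a list of equal-length points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def robust_center(points, beta=0.5, iters=50):
    """Estimate a cluster center as an iteratively reweighted average.

    Assumption (not from the paper): each point's weight is
    exp(-beta * squared Euclidean distance to the current center),
    so points far from the center contribute very little.
    """
    dim = len(points[0])
    center = plain_mean(points)  # start from the unweighted mean
    for _ in range(iters):
        # Recompute one weight per point from the current center.
        weights = [
            math.exp(-beta * sum((p[d] - center[d]) ** 2 for d in range(dim)))
            for p in points
        ]
        total = sum(weights)
        if total == 0.0:  # all weights underflowed; keep the current estimate
            break
        # New center is the weighted average of all points.
        center = [
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ]
    return center


# Four points clustered near (0.1, 0.1) plus one distant noise point.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2), (10.0, 10.0)]
mean_center = plain_mean(data)       # pulled toward the noise point
weighted_center = robust_center(data)  # stays near the four clustered points
```

In this toy data the unweighted mean is dragged toward the noise point, while the reweighted estimate stays close to the four clustered points, which is the qualitative behavior the abstract attributes to the weighted average.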

CLC Number: