Journal of Computer Applications, 2014, Vol. 34, Issue 8: 2279-2284. DOI: 10.11772/j.issn.1001-9081.2014.08.2279

• Artificial intelligence

High-dimensional data clustering algorithm with subspace optimization

WU Tao, CHEN Lifei, GUO Gongde

  1. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou, Fujian 350007, China
  • Received: 2014-01-06  Revised: 2014-04-04  Online: 2014-08-01  Published: 2014-08-10
  • Contact: WU Tao

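  • About the authors: WU Tao (吴涛), born 1990, male, from Longyan, Fujian, master's student; main research interest: data mining. CHEN Lifei (陈黎飞), born 1972, male, from Changle, Fujian, associate professor, Ph.D.; main research interests: data mining and machine learning. GUO Gongde (郭躬德), born 1965, male, from Longyan, Fujian, professor, Ph.D.; main research interests: artificial intelligence, data mining, and machine learning.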
  • Supported by:

    National Natural Science Foundation of China; Shenzhen Municipal Basic Research (Key) Project

Abstract:

Most existing soft subspace clustering algorithms do not optimize the projected subspaces associated with the clusters. To address this problem, a new soft subspace clustering algorithm was proposed. Maximizing the deviation among feature weights was taken as the subspace optimization goal, and a quantitative formula for it was presented. On this basis, a new objective function was designed that minimizes within-cluster dispersion (that is, makes each cluster more compact) while optimizing the soft subspace associated with each cluster. A new expression for computing feature weights was derived mathematically, and the new clustering algorithm was defined on it within the framework of classical k-means. The experimental results show that the proposed method significantly reduces the probability of becoming trapped prematurely in a local optimum and improves the stability of the clustering results. It also achieves good performance and clustering effectiveness, making it suitable for cluster analysis of high-dimensional data.
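
The abstract does not reproduce the paper's objective function or its feature-weight formula, so the following is only a minimal sketch of the general family it belongs to: a k-means-style loop in which each cluster keeps its own feature weights. The weight update here is an exponential (entropy-style) rule chosen purely for illustration, and it is not the authors' derived expression. The function name `soft_subspace_kmeans` and the parameters `gamma` and `init_centers` are hypothetical.

```python
import math
import random


def soft_subspace_kmeans(points, k, gamma=1.0, iters=20, init_centers=None, seed=0):
    """Generic soft subspace k-means with one weight vector per cluster.

    ASSUMPTION: weights follow an exponential rule over per-feature
    dispersions. This stands in for the paper's formula, which the
    abstract does not give.
    """
    dim = len(points[0])
    if init_centers is None:
        centers = [list(p) for p in random.Random(seed).sample(points, k)]
    else:
        centers = [list(c) for c in init_centers]
    weights = [[1.0 / dim] * dim for _ in range(k)]
    labels = [0] * len(points)

    for _ in range(iters):
        # Assignment step: nearest center under each cluster's weighted distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum(
                    weights[c][j] * (p[j] - centers[c][j]) ** 2 for j in range(dim)
                ),
            )
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                continue  # empty cluster: keep its previous center and weights
            n = len(members)
            # Center update: per-feature mean of the members.
            centers[c] = [sum(p[j] for p in members) / n for j in range(dim)]
            # Weight update: a feature with small dispersion gets a large weight.
            disp = [
                sum((p[j] - centers[c][j]) ** 2 for p in members) / n
                for j in range(dim)
            ]
            low = min(disp)  # shift by the minimum for numerical stability
            raw = [math.exp(-(d - low) / gamma) for d in disp]
            total = sum(raw)
            weights[c] = [r / total for r in raw]
    return labels, centers, weights


# Two groups separated on feature 0; feature 1 is spread out and uninformative.
data = [(0.0, -4.0), (0.2, 0.0), (-0.2, 4.0), (0.1, 2.0),
        (10.0, -4.0), (10.2, 0.0), (9.8, 4.0), (10.1, -2.0)]
labels, centers, weights = soft_subspace_kmeans(
    data, k=2, init_centers=[(0.0, 0.0), (10.0, 0.0)]
)
```

In this toy run both clusters end up placing almost all of their weight on feature 0, which is the soft-subspace idea in miniature. The paper's contribution, as the abstract describes it, is an additional term that rewards large deviation among the weights; that term is not modelled here.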

