Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (1): 150-153. DOI: 10.11772/j.issn.1001-9081.2016.01.0150


Fuzzy clustering algorithm based on midpoint density function

ZHOU Yueyue, HU Jie, SU Tao   

  1. School of Computer and Information Engineering, Hubei University, Wuhan Hubei 430062, China
  • Received: 2015-07-01 Revised: 2015-09-03 Online: 2016-01-09 Published: 2016-01-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61202100).


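  • Corresponding author: HU Jie (born 1977), female, from Hanchuan, Hubei; associate professor, Ph.D. Research interests: databases and their reasoning techniques, semantic databases, complex data management.
  • About the authors: ZHOU Yueyue (born 1991), male, from Xingping, Shaanxi; master's student. Research interests: data mining, databases. SU Tao (born 1990), male, from Honghu, Hubei; master's student. Research interests: computer image processing, visual computing.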

Abstract: In the traditional Fuzzy C-Means (FCM) clustering algorithm, the initial cluster centers are uncertain and the number of clusters must be set manually in advance, both of which can lead to inaccurate results. To address these problems, a fuzzy clustering algorithm based on a midpoint density function is proposed. Firstly, the idea of stepwise regression is incorporated into the selection of the initial cluster centers, which keeps the convergence from being trapped in a local cycle. Secondly, the candidate numbers of clusters are determined. Finally, the clustering results are evaluated with fuzzy clustering validity indexes based on overlap degree and separation degree to determine the optimal number of clusters. Experimental results show that, compared with the original improved FCM algorithm, the proposed algorithm reduces the number of iterations and the clustering time, increases the average accuracy by 12%, and outperforms the comparison algorithm on the clustering performance indexes.

Key words: Fuzzy C-Means (FCM), midpoint method, class set density function method, stepwise regression, validity index
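
For context, the baseline that the abstract starts from can be sketched as below. This is a minimal pure-Python version of standard FCM with random initial memberships and a preset cluster count `c`, which are the two weaknesses the abstract names. It is not the proposed midpoint-density algorithm. The function name `fcm`, the fuzzifier `m = 2.0`, the tolerance `tol`, the `seed` parameter, and the toy data are all illustrative assumptions, not taken from the paper.

```python
import math
import random

def fcm(points, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy C-means (baseline sketch, not the paper's method).

    Returns (centers, memberships, iterations_used).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for it in range(1, max_iter + 1):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Membership update from the distances to the new centers.
        new_u = []
        for p in points:
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            new_u.append([
                1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for k in range(c)
            ])
        # Stop once the largest membership change falls below tol.
        change = max(abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u, it

# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, u, iterations = fcm(points, c=2)
# Hard labels: the cluster with the largest membership for each point.
labels = [max(range(2), key=lambda k: row[k]) for row in u]
```

Because both the starting memberships and `c` are supplied by the caller, different seeds or a poorly chosen `c` can change the result, which is the sensitivity the proposed initialization and validity indexes are meant to reduce.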

