Journal of Computer Applications ›› 2014, Vol. 34 ›› Issue (8): 2166-2169. DOI: 10.11772/j.issn.1001-9081.2014.08.2166

• Papers of the 5th China Conference on Data Mining (CCDM 2014) •

Novel validity index for fuzzy clustering

ZHENG Hongliang, XU Benqiang, ZHAO Xiaohui, ZOU Li

  1. School of Computer and Information Technology, Liaoning Normal University, Dalian, Liaoning 116081, China
  • Received: 2014-05-08 Revised: 2014-05-20 Online: 2014-08-01 Published: 2014-08-10
  • Corresponding author: ZHENG Hongliang
  • About the authors: ZHENG Hongliang (born 1970), male, from Tieling, Liaoning; lecturer, M.S.; research interests: artificial intelligence and data mining. XU Benqiang (born 1978), male, from Shuangcheng, Heilongjiang; lecturer, M.S.; research interest: artificial intelligence. ZHAO Xiaohui (born 1987), female, from Dalian, Liaoning; master's student; research interest: data mining. ZOU Li (born 1971), female, from Dalian, Liaoning; associate professor, Ph.D., CCF member; research interest: intelligent information processing.
  • Supported by:

    National Natural Science Foundation of China

Abstract:

The classical fuzzy C-means (FCM) algorithm requires the number of clusters to be specified in advance; without it the algorithm cannot run, which limits its range of application. To address this problem, a new fuzzy cluster validity index is proposed. First, the membership matrix is obtained by running the FCM algorithm. Second, the intra-class compactness and the inter-class overlap are computed from the membership matrix. Finally, a new cluster validity index is defined in terms of the intra-class compactness and the inter-class overlap. The index removes the need to fix the number of clusters beforehand and can be used to find the number of clusters that best matches the natural distribution of the data. Experiments on artificial and real data sets show that the optimal cluster number is found for each of three commonly used values of the fuzzy factor: 1.8, 2.0 and 2.2.
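The workflow described in the abstract (run FCM for each candidate cluster number, score the resulting membership matrix, keep the best-scoring number) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the formula of the proposed index, so Bezdek's partition coefficient, which is also computed from the membership matrix alone, is swapped in as a stand-in. The function names, the farthest-point seeding and the toy data are likewise hypothetical and not taken from the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centers(points, c):
    # Assumption: deterministic farthest-point seeding, used here only to
    # keep the sketch reproducible; the paper does not specify a seeding.
    centers = [list(points[0])]
    while len(centers) < c:
        far = max(points, key=lambda p: min(sq_dist(p, v) for v in centers))
        centers.append(list(far))
    return centers

def memberships(points, centers, m):
    """Standard FCM membership update; U[k][i] is point k's degree in cluster i."""
    U = []
    for p in points:
        inv = [max(sq_dist(p, v), 1e-12) ** (-1.0 / (m - 1.0)) for v in centers]
        total = sum(inv)
        U.append([x / total for x in inv])
    return U

def fcm(points, c, m=2.0, iters=200, tol=1e-8):
    """Plain fuzzy C-means with fuzzy factor m; returns (centers, U)."""
    dim = len(points[0])
    centers = init_centers(points, c)
    U = memberships(points, centers, m)
    for _ in range(iters):
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in U]  # fuzzified weights for cluster i
            total = sum(w)
            new_centers.append(
                [sum(wk * p[d] for wk, p in zip(w, points)) / total for d in range(dim)]
            )
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        U = memberships(points, centers, m)
        if shift < tol:  # centers have stopped moving
            break
    return centers, U

def partition_coefficient(U):
    # Stand-in validity index (Bezdek's partition coefficient), NOT the
    # index proposed in the paper. Larger values mean a crisper partition.
    return sum(u * u for row in U for u in row) / len(U)

def best_cluster_number(points, m, c_values):
    """Score each candidate cluster number and return the best one with all scores."""
    scores = {c: partition_coefficient(fcm(points, c, m)[1]) for c in c_values}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: three tight, well-separated groups of four points.
points = [(x + dx, y + dy)
          for x, y in ((0.0, 0.0), (10.0, 0.0), (0.0, 10.0))
          for dx in (0.0, 0.5) for dy in (0.0, 0.5)]

for m in (1.8, 2.0, 2.2):  # the three fuzzy factor values named in the abstract
    best_c, scores = best_cluster_number(points, m, range(2, 6))
    print(f"m={m}: best c={best_c}")
```

Replacing `partition_coefficient` with a function that combines intra-class compactness and inter-class overlap would bring the sketch closer to the approach the abstract outlines; the scanning loop itself would stay the same.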

CLC Number: