Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (8): 1995-1998.

• Artificial Intelligence •

Method for determining the optimal number of clusters in the K-means clustering algorithm

Shi-Bing Zhou1, Zhen-Yuan Xu1, Xu-Qing Tang2

  1. Jiangnan University
  2.
  • Received: 2010-02-23 Revised: 2010-03-21 Online: 2010-07-30 Published: 2010-08-01
  • Contact: Shi-Bing Zhou
  • Supported by:
    National 863 Program project; Research on data-driven fault diagnosis methods and their applications

Abstract: The K-means clustering algorithm partitions a dataset on the premise that the number of clusters k is given, yet k usually cannot be determined in advance. A new clustering validity index is designed from the perspective of the geometric structure of the samples, and on this basis a new method for determining the optimal number of clusters for the K-means algorithm is proposed. Theoretical analysis and experimental results verify the effectiveness and good performance of the proposed method.

Key words: K-means clustering, number of clusters, clustering validity index, cluster analysis
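
The general workflow the abstract describes (run K-means for several candidate values of k, score each partition with a validity index, keep the best-scoring k) can be illustrated with a minimal sketch. The abstract does not give the formula of the proposed index, so the code below substitutes the well-known mean silhouette coefficient as a stand-in; it should not be read as the authors' index or method. All function names (`dist`, `farthest_first`, `kmeans`, `mean_silhouette`, `best_k`, `demo_points`) and the toy data are hypothetical, and farthest-first initialization is an assumption made only to keep the run deterministic.

```python
import math


def dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def farthest_first(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen center."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster became empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Stand-in validity index: mean silhouette coefficient over all samples."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, k_max):
    """Score k = 2..k_max and return the k with the highest index value."""
    scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores


def demo_points():
    """Toy data: three well-separated 3x3 grids of points in the plane."""
    offsets = (-0.5, 0.0, 0.5)
    return [(cx + dx, cy + dy)
            for cx, cy in ((0.0, 0.0), (10.0, 0.0), (5.0, 9.0))
            for dx in offsets for dy in offsets]


if __name__ == "__main__":
    k, scores = best_k(demo_points(), 6)
    print("selected k:", k)
    for candidate, score in sorted(scores.items()):
        print(candidate, round(score, 3))
```

Any other validity index can be swapped in by replacing `mean_silhouette` with a function of the same signature, which is the only place where the choice of index enters the search over k.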