Method for determining optimal number of clusters in K-means clustering algorithm

Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (8): 1995-1998.

Shi-Bing Zhou¹, Zhen-Yuan Xu¹, Xu-Qing Tang²

1. Jiangnan University
2.

Received: 2010-02-23 Revised: 2010-03-21 Online: 2010-07-30 Published: 2010-08-01
Corresponding author: Shi-Bing Zhou
Supported by:
National 863 Program project; Research on data-driven fault diagnosis methods and their applications

Abstract

The K-means clustering algorithm partitions a dataset into a prescribed number of clusters k, but k usually cannot be determined in advance. A new clustering validity index is designed from the standpoint of the geometric structure of the samples, and, based on this index, a new method for determining the optimal number of clusters for the K-means algorithm is proposed. Theoretical analysis and experimental results confirm the validity and good performance of the proposed method.

Key words: K-means clustering, number of clusters, clustering validity index, cluster analysis
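
The general scheme the abstract describes (run K-means for each candidate k, score every partition with a validity index, keep the best-scoring k) can be sketched as follows. This is only an illustrative sketch, not the authors' method: the abstract does not define the proposed index, so the well-known silhouette coefficient is substituted as a stand-in. The function names `kmeans`, `silhouette`, and `best_k`, the farthest-first initialization (chosen here to make the run deterministic), and the toy data are all assumptions.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's K-means; returns one cluster label per point.

    Assumption: centers are seeded farthest-first, starting from the
    first point, so the result is deterministic.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient (stand-in index); singletons score 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_k(points, k_min, k_max):
    """Score each k in [k_min, k_max] and return (best k, all scores)."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): three well-separated 2x2 squares of points.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)] for dx, dy in offsets]
k, scores = best_k(points, 2, 5)
print(k, scores)
```

Any other validity index can be dropped in by replacing `silhouette` with a function of the same signature, provided that larger values indicate a better partition.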

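Shi-Bing Zhou, Zhen-Yuan Xu, Xu-Qing Tang. Method for determining optimal number of clusters in K-means clustering algorithm[J]. Journal of Computer Applications, 2010, 30(8): 1995-1998.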
