Modified K-means clustering algorithm based on good point set and Leader method

doi:10.3724/SP.J.1087.2011.01359

Journal of Computer Applications, 2011, Vol. 31, Issue (05): 1359-1362. DOI: 10.3724/SP.J.1087.2011.01359

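ZHANG Yan-ping, ZHANG Juan, HE Cheng-gang, CHU Wei-cui, ZHANG Li-na

School of Computer Science and Technology, Anhui University, Hefei, Anhui 230039, China

Received: 2010-09-02  Revised: 2010-10-23  Online: 2011-05-01  Published: 2011-05-01
Corresponding author: ZHANG Juan
About the authors: ZHANG Yan-ping (1962-), female, born in Chaohu, Anhui, professor and doctoral supervisor; main research interests: machine learning, artificial intelligence, complex networks, neural networks. ZHANG Juan (1985-), female, born in Hefei, Anhui, master's student; main research interests: machine learning, neural networks. HE Cheng-gang (1984-), male, born in Xinyang, Henan, master's student; main research interest: machine learning. CHU Wei-cui (1986-), female, born in Hefei, Anhui, master's student; main research interest: information retrieval. ZHANG Li-na (1984-), female, born in Huaibei, Anhui, master's student; main research interests: machine learning, neural networks.
Funding: National Natural Science Foundation of China (60675031); National 973 Program of China (2007BC311003).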

Abstract

Abstract: The traditional K-means algorithm is sensitive to its initial cluster centers. To address this problem, the theory of good point sets from number theory is combined with the Leader method to heuristically generate the initial centers for K-means. Depending on how the two are combined, the resulting algorithms are called KLG and KGL. Good point set theory yields points that are better than randomly selected ones, while the Leader method reflects the distribution of the data objects themselves; combining the strengths of both gives optimized initial centers. Experiments on UCI datasets show that both KLG and KGL produce better results than the traditional K-means algorithm and several other K-means initialization methods.

Key words: K-means algorithm, good point set, Leader method
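
The abstract names two building blocks without giving their details, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used cyclotomic construction of a good point set (coordinates frac(i * 2cos(2πj/p)), with p the smallest prime satisfying (p - 3)/2 >= dimension) and the classic single-pass Leader clustering with a distance threshold. The function `initial_centers` and its "snap each good point to the nearest unused leader" rule are hypothetical illustrations of one way to combine the two; they are not the KLG or KGL procedures of the paper.

```python
import math


def smallest_prime_at_least(n):
    """Return the smallest prime p with p >= n (trial division)."""
    p = max(n, 2)
    while any(p % d == 0 for d in range(2, math.isqrt(p) + 1)):
        p += 1
    return p


def good_point_set(n, dim):
    """Assumed construction: point i (i = 1..n) has coordinates
    frac(i * r_j), r_j = 2*cos(2*pi*j/p), j = 1..dim, where p is the
    smallest prime with (p - 3) / 2 >= dim. Points lie in the unit cube."""
    p = smallest_prime_at_least(2 * dim + 3)
    r = [2.0 * math.cos(2.0 * math.pi * j / p) for j in range(1, dim + 1)]
    return [[(i * rj) % 1.0 for rj in r] for i in range(1, n + 1)]


def leader_clusters(points, threshold):
    """Single pass over the data: a point joins its nearest leader if that
    leader is within `threshold`, otherwise it becomes a new leader.
    Returns the leaders and the number of points attached to each."""
    leaders, counts = [], []
    for x in points:
        best, best_d = -1, float("inf")
        for idx, leader in enumerate(leaders):
            d = math.dist(x, leader)
            if d < best_d:
                best, best_d = idx, d
        if best >= 0 and best_d <= threshold:
            counts[best] += 1
        else:
            leaders.append(list(x))
            counts.append(1)
    return leaders, counts


def initial_centers(points, k, threshold):
    """Hypothetical combination (not the paper's KLG/KGL): scale k good
    points to the bounding box of the data, then replace each one by the
    nearest leader that has not been chosen yet."""
    dim = len(points[0])
    lo = [min(p[j] for p in points) for j in range(dim)]
    hi = [max(p[j] for p in points) for j in range(dim)]
    leaders, _ = leader_clusters(points, threshold)
    if len(leaders) < k:
        raise ValueError("threshold too large: fewer leaders than k")
    centers, used = [], set()
    for g in good_point_set(k, dim):
        target = [lo[j] + g[j] * (hi[j] - lo[j]) for j in range(dim)]
        idx = min((i for i in range(len(leaders)) if i not in used),
                  key=lambda i: math.dist(target, leaders[i]))
        used.add(idx)
        centers.append(leaders[idx])
    return centers


if __name__ == "__main__":
    # Toy data: three well-separated groups in the plane.
    data = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0],
            [5.2, 4.9], [10.0, 0.0], [9.8, 0.3]]
    print(initial_centers(data, k=3, threshold=1.0))
```

The returned centers would then be passed to an ordinary K-means loop in place of randomly chosen starting points. The threshold controls how many leaders the single pass produces, so in this sketch it must be small enough to yield at least k leaders.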

ZHANG Yan-ping, ZHANG Juan, HE Cheng-gang, CHU Wei-cui, ZHANG Li-na. Modified K-means clustering algorithm based on good point set and Leader method[J]. Journal of Computer Applications, 2011, 31(05): 1359-1362.
