Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (02): 432-434.

• Database and Data Mining •

An Improved Density-Based K-means Algorithm and Its Implementation

FU De-Sheng, ZHOU Chen

  1. Nanjing University of Information Science and Technology
  • Received: 2010-06-23 Revised: 2010-09-07 Online: 2011-02-01 Published: 2011-02-01
  • Corresponding author: FU De-Sheng

Improved K-means algorithm and its implementation based on density

FU De-Sheng, ZHOU Chen

  • Received: 2010-06-23 Revised: 2010-09-07 Online: 2011-02-01 Published: 2011-02-01
  • Contact: FU De-Sheng

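Abstract: The traditional K-means algorithm generates its initial cluster centers at random from the data set, so its clustering results are highly unstable. This paper proposes an improved K-means algorithm that optimizes the initial cluster centers with a density-based method: it selects, as initial centers, the k points in high-density regions that are farthest from one another. Experiments show that the improved K-means algorithm eliminates the dependence on the initial cluster centers and considerably improves the clustering results.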

Keywords: clustering, K-means algorithm, initial cluster centers, high-density region

Abstract: The initial cluster centers of the traditional K-means algorithm are generated randomly from the data set, so the clustering result is unstable. An improved K-means algorithm is proposed that uses a density-based method to optimize the initial cluster centers: it selects, as the initial centers, the k points in high-density regions whose mutual distances are the largest. The experimental results demonstrate that the improved K-means algorithm can eliminate the dependence on the initial cluster centers, and that the clustering result is greatly improved.

Key words: clustering, K-means algorithm, initial clustering center, high-density area
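
The initialization idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything specific here is an assumption made for illustration: the density of a point is taken to be the number of other points within a radius `eps` (a hypothetical parameter), a point counts as "high-density" when its density is at least the mean density, and the "mutually farthest" points are chosen greedily, starting from the densest point. The function names `density_based_init` and `kmeans` are likewise hypothetical, and the iteration that follows the initialization is the standard K-means loop.

```python
import numpy as np


def density_based_init(X, k, eps):
    """Pick k initial centers: mutually distant points from dense regions.

    Assumptions (not taken from the paper): density is the number of other
    points within radius eps, and "high-density" means density >= mean density.
    """
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distance matrix (O(n^2) memory).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    density = (dist <= eps).sum(axis=1) - 1  # subtract 1 to exclude the point itself
    dense_idx = np.flatnonzero(density >= density.mean())
    if len(dense_idx) < k:
        raise ValueError("fewer than k high-density points; try a different eps")

    # Greedy farthest-point selection, seeded with the densest point.
    chosen = [dense_idx[np.argmax(density[dense_idx])]]
    while len(chosen) < k:
        # Distance from each candidate to its nearest already-chosen center.
        nearest = dist[np.ix_(dense_idx, chosen)].min(axis=1)
        chosen.append(dense_idx[np.argmax(nearest)])
    return X[chosen]


def kmeans(X, centers, n_iter=100):
    """Standard K-means iterations starting from the given centers."""
    X = np.asarray(X, dtype=float)
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its points (keep it if empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


# Usage on synthetic data: three well-separated Gaussian blobs.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([rng.normal(m, 0.5, size=(50, 2)) for m in means])
init = density_based_init(X, k=3, eps=1.0)
centers, labels = kmeans(X, init)
print(centers)
```

Because the selection is deterministic for a given `eps`, repeated runs on the same data start from the same centers, which is the stability property the abstract is after. The result still depends on the choice of `eps` and on the density threshold, both of which are illustrative choices here.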