Journal of Computer Applications

• Database Technology • Previous Articles     Next Articles

Fast K-means algorithm based on influence factors

<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>M<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>W<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>L<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>g<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>X<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>i<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>a<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>o<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>y<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>u<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>C<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>h<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>e<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>n<a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a> <a href="http://www.joca.cn/EN/article/advancedSearchResult.do?searchSQL=((([Author]) AND 1[Journal]) AND year[Order])" target="_blank"></a>   

  • Received:2007-03-02 Revised:2007-05-07 Online:2007-12-01 Published:2007-12-01
  • Contact: MingWei Leng

一种基于影响因子的快速K-均值算法

冷明伟 陈晓云 颜清   

  1. 上饶师范学院 兰州大学 上饶师范学院 数学与计算机系
  • 通讯作者: 冷明伟

Abstract: The running time of K-means overly depends on the initial points but the right value of k is unknown and selecting the initial points effectively is also difficult. To solve this problem, depending on the research about initialization deeply, a high effective approach used to select the initial points was presented, which ensured at least one point to be selected in each cluster. Influence factor between clusters was presented to measure the similarity of two clusters, and a new merging algorithm based on it was put forward. This algorithm and the initial points' selection algorithm can automatically and fast give the actual value of k and select the right initial points based on the dataset characters. Finally, Gaussian datasets were used to test the algorithm and a satisfying result was obtained.

Key words: clustering, k-means, initial point, influence factor

摘要: K-均值聚类算法的执行时间过度依赖于初始点的选取,但是在实际问题中并不知道k的取值和怎样才能有效地选取初始点。在对K-均值算法中初始点的选取进行深入研究的基础上,提出了一种有效的初始点选取算法。现存的类间相似度并不能很好地度量两个类的相似性,为此提出了一种新颖的度量方法:类间影响因子,使用类间影响因子对类进行合并。该方法和上面提出的初始点选取算法能够根据数据集本身的特性快速地自动选取初始中心并给出初始点的个数。最后用高斯数据集对算法进行测试,得到了一个令人满意的结果。

关键词: 聚类, k-均值, 初始点, 影响因子

CLC Number: