Journal of Computer Applications, 2016, Vol. 36, Issue (8): 2066-2070. DOI: 10.11772/j.issn.1001-9081.2016.08.2066

K-means clustering algorithm based on adaptive cuckoo search and its application

YANG Huihua¹,², WANG Ke¹, LI Lingqiao¹, WEI Wen¹, HE Shengtao³

    1. Guangxi Experiment Center of Information Science, Guilin University of Electronic Technology, Guilin Guangxi 541004, China;
    2. Automation School, Beijing University of Posts and Telecommunications, Beijing 100876, China;
    3. Guilin Intelligent Metric Information Technology Company Limited, Guilin Guangxi 541004, China
  • Received: 2016-03-01; Revised: 2016-05-16; Online: 2016-08-10; Published: 2016-08-10
  • Supported by:
    This work was partially supported by the National Natural Science Foundation of China (21365008, 61562013), the Guangxi Natural Science Foundation (2013GXNSFBA019279), and the Innovation Project of Graduate Education of Guilin University of Electronic Technology (GDYCSZ201478, GDYCSZ201474).

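  • Corresponding author: YANG Huihua
  • About the authors: YANG Huihua (杨辉华, born 1972), male, from Changde, Hunan; professor, doctoral supervisor, Ph.D.; main research interests: machine learning, artificial intelligence, optimization. WANG Ke (王克, born 1988), male, from Huaibei, Anhui; master's student; main research interests: data mining, distributed computing. LI Lingqiao (李灵巧, born 1986), male, from Dazhou, Sichuan; doctoral student; main research interests: intelligent information processing, machine learning. WEI Wen (魏文, born 1989), male, from Rizhao, Shandong; master's student; main research interests: pattern recognition, machine learning. HE Shengtao (何胜韬, born 1985), male, from Yulin, Guangxi; master's degree; main research interest: software engineering.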

Abstract: The original K-means clustering algorithm is highly sensitive to the initial cluster centroids and easily falls into local optima. To solve this problem, an improved K-means clustering algorithm based on Adaptive Cuckoo Search (ACS), namely ACS-K-means, was proposed, in which the search step size of the cuckoo is adjusted adaptively to improve solution quality and speed up convergence. ACS-K-means was first evaluated on UCI datasets, and the results showed that it outperformed K-means, GA-K-means (K-means based on Genetic Algorithm), CS-K-means (K-means based on Cuckoo Search) and PSO-K-means (K-means based on Particle Swarm Optimization) in both clustering quality and convergence rate. Finally, ACS-K-means was applied to the development of a heat map of urban management cases in Qingxiu District, Nanning City; the results again showed better clustering quality and faster convergence.

Key words: data mining, K-means clustering, Cuckoo Search (CS) algorithm, digital urban management, heat map
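
The idea summarized in the abstract (a cuckoo search over candidate centroid sets, with a step size that changes during the run, combined with K-means refinement) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the adaptive rule, so the linearly decaying step size below is an assumption, and the function names (`acs_kmeans`, `kmeans_step`, `levy`, `sse`) and all parameter defaults are hypothetical. The objective used here is the sum of squared distances of points to their nearest centroid.

```python
import math
import random


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sum((p - c) ** 2 for p, c in zip(pt, ce)) for ce in centroids)
               for pt in points)


def levy(rng, beta=1.5):
    """Draw one Levy-distributed step length using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / (abs(v) + 1e-12) ** (1 / beta)


def kmeans_step(points, centroids):
    """One Lloyd iteration: assign points, then move centroids to cluster means."""
    groups = [[] for _ in centroids]
    for pt in points:
        j = min(range(len(centroids)),
                key=lambda i: sum((p - c) ** 2 for p, c in zip(pt, centroids[i])))
        groups[j].append(pt)
    # An empty cluster keeps its previous centroid.
    return [tuple(sum(col) / len(g) for col in zip(*g)) if g else ce
            for g, ce in zip(groups, centroids)]


def acs_kmeans(points, k, n_nests=10, iters=60, pa=0.25,
               step_max=0.5, step_min=0.01, seed=0):
    """Return (best_centroids, best_sse). Each nest is one set of k centroids."""
    rng = random.Random(seed)
    nests = [rng.sample(points, k) for _ in range(n_nests)]
    fit = [sse(points, n) for n in nests]
    for t in range(iters):
        # ASSUMPTION: the step size decays linearly from step_max to step_min.
        # The paper's actual adaptive rule is not stated in the abstract.
        step = step_max - (step_max - step_min) * t / max(iters - 1, 1)
        for i in range(n_nests):
            # Levy-flight perturbation of every centroid coordinate ...
            cand = [tuple(x + step * levy(rng) for x in ce) for ce in nests[i]]
            # ... followed by one K-means refinement step.
            cand = kmeans_step(points, cand)
            f = sse(points, cand)
            if f < fit[i]:  # greedy replacement: keep the better solution
                nests[i], fit[i] = cand, f
        best_i = fit.index(min(fit))
        for i in range(n_nests):
            # Abandon non-best nests with probability pa and re-seed them.
            if i != best_i and rng.random() < pa:
                nests[i] = rng.sample(points, k)
                fit[i] = sse(points, nests[i])
    best_i = fit.index(min(fit))
    return nests[best_i], fit[best_i]
```

Calling `acs_kmeans(points, k)` with `points` as a list of equal-length numeric tuples returns the best centroid set found and its objective value. Because the perturbation is not scaled to the data, `step_max` and `step_min` would need tuning to the coordinate range of a real dataset.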

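Abstract (translated from the Chinese): To address the shortcomings of the original K-means clustering algorithm, namely its excessive sensitivity to the initial cluster centers and its tendency to fall into local optima, a K-means clustering algorithm based on an improved Cuckoo Search (CS), called ACS-K-means, is proposed. The adaptive CS (ACS) algorithm introduces adaptive step-size adjustment into the standard CS algorithm to improve search accuracy and convergence speed. On UCI standard datasets, ACS-K-means achieves better clustering quality and faster convergence than K-means, K-means based on a genetic algorithm (GA-K-means), K-means based on cuckoo search (CS-K-means) and K-means based on particle swarm optimization (PSO-K-means). ACS-K-means is applied to the development of the urban management case heat map in the "Chengguantong" (城管通) system of Qingxiu District, Nanning City, where the geographic coordinates of cases are clustered and displayed on a map; the application results show good clustering quality and fast convergence.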

