Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (11): 3080-3084.

### Grid clustering algorithm based on density peaks

YANG Jie1,2, WANG Guoyin1, WANG Fei1

1. Chongqing Key Laboratory of Computational Intelligence (Chongqing University of Posts and Telecommunications), Chongqing 400065, China;
2. School of Physics and Electronics, Zunyi Normal University, Zunyi Guizhou 563002, China
• Received: 2017-05-16 Revised: 2017-06-14 Online: 2017-11-10 Published: 2017-11-11
• Supported by:
This work is partially supported by the National Natural Science Foundation of China (61572091), the Chongqing Postgraduate Scientific Research and Innovation Project (CYB16106), the High-end Talent Project (RC2016005), and the Key Discipline Project of Guizhou Province (QXWB[2013]18).

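• Corresponding author: WANG Guoyin
• About the authors: YANG Jie (born 1987), male, from Zunyi, Guizhou, Ph.D. candidate; main research interests: granular computing, rough sets, data mining. WANG Guoyin (born 1970), male, from Chongqing, professor, Ph.D., CCF member; main research interests: granular computing, soft computing, cognitive computing. WANG Fei (born 1989), male, from Kaifeng, Henan, master's candidate; main research interests: data mining, granular computing.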

Abstract: The Density Peak Clustering (DPC) algorithm, proposed in 2014, is simple and novel: it requires few parameters and no iteration. Building on DPC, this paper proposes a grid clustering algorithm that can efficiently handle large-scale data. First, the N-dimensional space is divided into disjoint rectangular cells, and statistics are collected for each cell. Then the central cells of the space are found using the DPC criterion: a central cell is surrounded by grid cells of lower local density and lies relatively far from any grid cell of higher local density. Finally, the grid cells adjacent to each central cell are merged with it to obtain the clustering result. Experimental results on UCI and artificial data sets show that the proposed algorithm finds the cluster centers quickly and handles the clustering of large-scale data effectively. Compared with the original density peak clustering algorithm on different data sets, it reduces running time by a factor of 10 to 100 while keeping the loss of accuracy at 5% to 8%.
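
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `grid_density_peaks` and the parameters `cell_size` and `n_centers` are hypothetical, the cell statistic is assumed to be a plain point count, central cells are assumed to be the cells with the largest product of density and distance to a denser cell, and merging is assumed to be a breadth-first spread over adjacent occupied cells.

```python
from collections import Counter, deque
from itertools import product
import math


def grid_density_peaks(points, cell_size, n_centers):
    """Illustrative grid-based density-peak clustering (assumptions noted inline)."""
    dim = len(points[0])

    def cell_of(p):
        # Map a point to the integer index of the grid cell containing it.
        return tuple(math.floor(x / cell_size) for x in p)

    # Step 1: partition the space into cells; density (rho) is assumed
    # to be the number of points falling in each cell.
    rho = Counter(cell_of(p) for p in points)

    # Order cells by decreasing density; ties are broken by cell index, so an
    # earlier cell in this order is treated as "denser" (an assumption).
    order = sorted(rho, key=lambda c: (-rho[c], c))

    # Step 2: delta is the distance to the nearest denser cell. The densest
    # cell gets the largest distance to any cell, as in the usual DPC convention.
    delta = {}
    for i, c in enumerate(order):
        if i == 0:
            delta[c] = max(math.dist(c, o) for o in order)
        else:
            delta[c] = min(math.dist(c, o) for o in order[:i])

    # Central cells: assumed here to be the top n_centers by rho * delta.
    centers = sorted(order, key=lambda c: -(rho[c] * delta[c]))[:n_centers]

    # Step 3: merge adjacent occupied cells outward from each central cell
    # (multi-source breadth-first search; an assumed merging rule).
    labels = {c: k for k, c in enumerate(centers)}
    queue = deque(centers)
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]
    while queue:
        c = queue.popleft()
        for o in offsets:
            nb = tuple(a + b for a, b in zip(c, o))
            if nb in rho and nb not in labels:
                labels[nb] = labels[c]
                queue.append(nb)

    # Points in cells not reachable from any central cell are labeled -1.
    return [labels.get(cell_of(p), -1) for p in points], centers
```

Because distances are computed between occupied cells rather than between individual points, the quadratic step scales with the number of cells instead of the number of points, which is the intuition behind the reported speed-up. The cell size trades speed against accuracy, and calling `grid_density_peaks(points, 1.0, 2)` on a list of coordinate tuples returns one cluster label per point together with the chosen central cells.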
