Journal of Computer Applications, 2013, Vol. 33, Issue (09): 2477-2481. DOI: 10.11772/j.issn.1001-9081.2013.09.2477
• Database technology •
ZHANG Jianpeng1, JIN Xin1, CHEN Fucai1, CHEN Hongchang2, HOU Ying1
张建朋1,金鑫1,陈福才1,陈鸿昶2,候颖1
Supported by: National High Technology Research and Development Program of China (863 Program)
Abstract: To address the low clustering quality and high communication cost of existing distributed data stream clustering algorithms, a distributed data stream clustering algorithm (DAPDC) that combines density-based clustering with representative-point clustering is proposed. Each local site runs affinity propagation clustering and uses cluster representative points to summarize the local distribution of its data stream. The global site then obtains the global model by merging the summary data structures uploaded from the local sites with an improved density-based clustering algorithm. Simulation results show that DAPDC significantly improves the clustering quality of data streams in a distributed environment. In addition, by using cluster representative points, the algorithm can find clusters of different shapes and significantly reduce the amount of data transferred.
Key words: data mining, distributed clustering, data stream, affinity propagation, density clustering
CLC Number: TP181
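The abstract gives no pseudocode, so the following is only a minimal Python sketch of the two-tier idea it describes, under several assumptions. Each local site runs textbook affinity propagation on a static batch (not a true stream) and uploads its exemplars with the number of points they represent. The global site merges the uploads with a weighted DBSCAN-style pass, which stands in for the paper's "improved density clustering" since that variant is not specified here. All function names, parameters (`eps`, `min_weight`, `damping`) and the synthetic data are hypothetical.

```python
import math
import random


def affinity_propagation(points, damping=0.7, iterations=200, preference=None):
    """Textbook affinity propagation; returns the exemplar index chosen for each point."""
    n = len(points)
    if n == 1:
        return [0]
    # Similarity is the negative squared Euclidean distance.
    sim = [[-sum((p - q) ** 2 for p, q in zip(points[i], points[k]))
            for k in range(n)] for i in range(n)]
    if preference is None:
        # Common default: median of the off-diagonal similarities.
        off_diag = sorted(sim[i][k] for i in range(n) for k in range(n) if i != k)
        preference = off_diag[len(off_diag) // 2]
    for i in range(n):
        sim[i][i] = preference
    resp = [[0.0] * n for _ in range(n)]
    avail = [[0.0] * n for _ in range(n)]
    for _ in range(iterations):
        # Responsibility: r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k')).
        for i in range(n):
            scores = [avail[i][k] + sim[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            runner_up = max(v for k, v in enumerate(scores) if k != best)
            for k in range(n):
                new = sim[i][k] - (runner_up if k == best else scores[best])
                resp[i][k] = damping * resp[i][k] + (1 - damping) * new
        # Availability: a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k}).
        for k in range(n):
            positive = [max(0.0, resp[i][k]) for i in range(n)]
            total = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, resp[k][k] + total - positive[i])
                avail[i][k] = damping * avail[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if resp[k][k] + avail[k][k] > 0]
    if not exemplars:  # fallback if message passing did not settle
        exemplars = [max(range(n), key=lambda k: resp[k][k] + avail[k][k])]
    chosen = set(exemplars)
    return [i if i in chosen else max(exemplars, key=lambda k: sim[i][k])
            for i in range(n)]


def summarize_site(points):
    """Local site: upload (representative point, weight) pairs instead of raw data."""
    counts = {}
    for k in affinity_propagation(points):
        counts[k] = counts.get(k, 0) + 1
    return [(points[k], w) for k, w in sorted(counts.items())]


def merge_representatives(reps, eps, min_weight):
    """Global site: weighted DBSCAN-style merge (assumed stand-in). -1 marks noise."""
    n = len(reps)

    def neighbours(i):
        return [j for j in range(n) if math.dist(reps[i][0], reps[j][0]) <= eps]

    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        near = neighbours(i)
        if sum(reps[j][1] for j in near) < min_weight:
            labels[i] = -1
            continue
        labels[i] = cluster
        queue = [j for j in near if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border representative
            if labels[j] is not None:
                continue
            labels[j] = cluster
            near_j = neighbours(j)
            if sum(reps[m][1] for m in near_j) >= min_weight:
                queue.extend(near_j)  # core representative: keep expanding
        cluster += 1
    return labels


def make_blob(rng, center, count, spread=0.3):
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(count)]


rng = random.Random(0)
# Two local sites, each observing part of the same two synthetic clusters.
sites = [make_blob(rng, (0, 0), 15) + make_blob(rng, (10, 10), 15) for _ in range(2)]
uploaded = [rep for site in sites for rep in summarize_site(site)]
global_labels = merge_representatives(uploaded, eps=4.0, min_weight=5)
print(len(uploaded), "representatives uploaded for", sum(len(s) for s in sites), "points")
```

In this sketch only the weighted representatives cross the network, which is where the reduction in transferred data would come from. Merging them by density rather than by centroid distance is what would let chains of nearby representatives trace clusters that are not spherical.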
ZHANG Jianpeng, JIN Xin, CHEN Fucai, CHEN Hongchang, HOU Ying. Distributed data stream clustering algorithm based on affinity propagation[J]. Journal of Computer Applications, 2013, 33(09): 2477-2481.
张建朋, 金鑫, 陈福才, 陈鸿昶, 候颖. 基于近邻传播的分布式数据流聚类算法[J]. 计算机应用, 2013, 33(09): 2477-2481.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2013.09.2477
http://www.joca.cn/EN/Y2013/V33/I09/2477