Density-sensitive clustering by data competition algorithm

Journal of Computer Applications, 2015, Vol. 35, Issue 2: 444-447. DOI: 10.11772/j.issn.1001-9081.2015.02.0444

SU Hui¹,², GE Hongwei¹,², ZHANG Huanqing¹, YUAN Yunhao¹

1. School of Internet of Things, Jiangnan University, Wuxi Jiangsu 214122, China;
2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education (Jiangnan University), Wuxi Jiangsu 214122, China

Received: 2014-09-16 Revised: 2014-10-19 Online: 2015-02-12 Published: 2015-02-10
Corresponding author: GE Hongwei
About the authors: SU Hui (born 1991), male, from Bengbu, Anhui, M.S. candidate, CCF member; research interests: artificial intelligence, pattern recognition. GE Hongwei (born 1967), male, from Wuxi, Jiangsu, professor, doctoral supervisor, Ph.D.; research interests: artificial intelligence, pattern recognition, image processing. ZHANG Huanqing (born 1982), male, from Shangqiu, Henan, Ph.D. candidate; research interests: information fusion, target tracking. YUAN Yunhao (born 1983), male, from Xuzhou, Jiangsu, associate professor, Ph.D.; research interests: pattern recognition, machine learning, information fusion, image processing.
Supported by:
National Natural Science Foundation of China (61402203, 61305017); Graduate Research and Innovation Program for Regular Higher Education Institutions of Jiangsu Province (KYLX_1122); Priority Academic Program Development of Jiangsu Higher Education Institutions.

Abstract:

To address the poor performance of the clustering by data competition algorithm on datasets with complex structure, a density-sensitive clustering by data competition algorithm was proposed. Firstly, a local distance was defined on the basis of a density-sensitive distance measure to describe the local consistency of the data distribution. Secondly, a global distance between data points was computed from the local distance to describe the global consistency of the data distribution and to mine its spatial distribution information, compensating for the weakness of Euclidean distance in describing global consistency. Finally, the global distance was used in the clustering by data competition algorithm. The proposed algorithm was compared with the original clustering by data competition algorithm based on Euclidean distance. Experimental results on synthetic and real-world datasets show that the proposed algorithm overcomes the difficulty that clustering by data competition has with complex-structured data and achieves higher clustering accuracy.

Key words: clustering, data competition, density-sensitive, distance measure, aggregation field
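
The abstract does not state the formulas for the local and global distances. As a minimal sketch only, assuming the local distance takes the density-adjustable form ρ^d - 1 (ρ > 1, d the Euclidean distance) known from density-sensitive spectral clustering, and assuming the global distance is the shortest-path length over these local distances, the two steps could look like this in Python. The function names `local_distance` and `global_distance` and the parameter `rho` are hypothetical and are not taken from the paper.

```python
import numpy as np


def local_distance(points, rho=2.0):
    """Pairwise density-adjustable length rho**d - 1 (assumed form, rho > 1).

    `points` is an (n, dim) array; d is the Euclidean distance between
    two points. Long direct jumps are penalised exponentially.
    """
    diff = points[:, None, :] - points[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))
    return np.power(rho, euclid) - 1.0


def global_distance(local):
    """All-pairs shortest-path length over the local distances.

    Uses the Floyd-Warshall recurrence, vectorised per intermediate
    point k. This is an assumption about how the global distance is
    obtained, not the paper's stated procedure.
    """
    dist = local.copy()
    n = dist.shape[0]
    for k in range(n):
        # Allow paths that pass through point k.
        dist = np.minimum(dist, dist[:, k, None] + dist[None, k, :])
    return dist
```

For three collinear points spaced one unit apart and `rho = 2`, the direct local distance between the two end points is 3, while the route through the middle point costs 1 + 1 = 2, so the global distance prefers paths that stay inside dense regions.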

SU Hui, GE Hongwei, ZHANG Huanqing, YUAN Yunhao. Density-sensitive clustering by data competition algorithm[J]. Journal of Computer Applications, 2015, 35(2): 444-447.

