Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (2): 444-447. DOI: 10.11772/j.issn.1001-9081.2015.02.0444

• Artificial Intelligence •

Density-sensitive clustering by data competition algorithm

SU Hui1,2, GE Hongwei1,2, ZHANG Huanqing1, YUAN Yunhao1

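  1. School of Internet of Things, Jiangnan University, Wuxi, Jiangsu 214122, China;
    2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education (Jiangnan University), Wuxi, Jiangsu 214122, China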
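  • Received: 2014-09-16 Revised: 2014-10-19 Online: 2015-02-10 Published: 2015-02-12
  • Corresponding author: GE Hongwei
  • About the authors: SU Hui (1991-), male, born in Bengbu, Anhui, M.S. candidate, CCF member; research interests: artificial intelligence, pattern recognition. GE Hongwei (1967-), male, born in Wuxi, Jiangsu, professor, doctoral supervisor, Ph.D.; research interests: artificial intelligence, pattern recognition, image processing. ZHANG Huanqing (1982-), male, born in Shangqiu, Henan, Ph.D. candidate; research interests: information fusion, target tracking. YUAN Yunhao (1983-), male, born in Xuzhou, Jiangsu, associate professor, Ph.D.; research interests: pattern recognition, machine learning, information fusion, image processing.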
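  • Supported by:

    the National Natural Science Foundation of China (61402203, 61305017); the Graduate Research and Innovation Program for Regular Universities of Jiangsu Province (KYLX_1122); the Priority Academic Program Development of Jiangsu Higher Education Institutions.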

Density-sensitive clustering by data competition algorithm

SU Hui1,2, GE Hongwei1,2, ZHANG Huanqing1, YUAN Yunhao1   

  1. School of Internet of Things, Jiangnan University, Wuxi, Jiangsu 214122, China;
    2. Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education (Jiangnan University), Wuxi, Jiangsu 214122, China
  • Received: 2014-09-16 Revised: 2014-10-19 Online: 2015-02-10 Published: 2015-02-12

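Abstract:

The clustering by data competition algorithm clusters poorly on datasets with complex structure. To address this problem, a density-sensitive clustering by data competition algorithm is proposed. First, a local distance is defined on the basis of a density-sensitive distance measure to describe the local consistency of the data distribution. Second, a global distance between data points is computed from the local distance to describe the global consistency of the data distribution and to mine the spatial distribution information of the data, compensating for the weakness of Euclidean distance in describing global consistency. Finally, the global distance is used in the clustering by data competition algorithm. The new algorithm is compared with the Euclidean-distance-based clustering by data competition algorithm. Experimental results on synthetic and real datasets show that it overcomes the difficulty that clustering by data competition has with complex-structured data and yields clustering results with higher accuracy.

Keywords: clustering, data competition, density-sensitive, distance measure, aggregation field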

Abstract:

Because the clustering by data competition algorithm performs poorly on datasets with complex structure, a density-sensitive clustering by data competition algorithm is proposed. First, a local distance is defined on the basis of a density-sensitive distance measure to describe the local consistency of the data distribution. Second, a global distance is calculated from the local distance; it describes the global consistency of the data distribution and mines the spatial distribution information of the data, which compensates for the weakness of Euclidean distance in describing global consistency. Finally, the global distance is used in the clustering by data competition algorithm. The proposed algorithm was compared with the original clustering by data competition algorithm based on Euclidean distance on synthetic and real-world datasets. The experimental results show that the proposed algorithm achieves higher clustering accuracy and overcomes the difficulty that the clustering by data competition algorithm has in handling datasets with complex structure.

Key words: clustering, data competition, density-sensitive, distance measure, aggregation field

CLC number:
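
The abstract describes a two-stage distance: a local distance built from a density-sensitive measure, then a global distance derived from it. The abstract does not give the formulas, so the following is only a minimal sketch under stated assumptions, not the authors' definitions. It assumes the density-adjustable edge length rho**d - 1 (with a hypothetical scaling factor rho > 1) that is common in the density-sensitive distance literature as the local distance, and takes the global distance to be the shortest-path length over those local edges. The function names `local_length` and `global_distance` are illustrative.

```python
import numpy as np

def local_length(X, rho=2.0):
    """Assumed local distance: density-adjustable edge length rho**d - 1,
    where d is the Euclidean distance between two points and rho > 1.
    Long edges are penalised far more than short ones."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return rho ** d - 1.0

def global_distance(L):
    """Assumed global distance: all-pairs shortest-path length over the
    complete graph whose edge weights are the local lengths
    (Floyd-Warshall, vectorised over each intermediate node k)."""
    D = L.copy()
    n = D.shape[0]
    for k in range(n):
        # Allow paths that pass through node k if they are shorter.
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

# Four evenly spaced points on a line form one dense chain.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
L = local_length(X, rho=2.0)
D = global_distance(L)
print(L[0, 3], D[0, 3])
```

In this toy example the direct local length between the two end points is 2**3 - 1 = 7, while the path through the two intermediate points costs 1 + 1 + 1 = 3, so the global distance shrinks along a densely populated chain. This is the kind of behaviour a plain Euclidean distance cannot express, and the resulting matrix `D` could then replace the Euclidean distance in a distance-based clustering routine.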