Journal of Computer Applications ›› 2011, Vol. 31 ›› Issue (02): 441-445.

• Database and Data Mining •

Clustering ensemble method based on co-occurrence similarity

LING Guang, WANG Mingchun, FENG Jiayi

  1. Tianjin University of Technology and Education
  • Received: 2010-07-19 Revised: 2010-09-09 Online: 2011-02-01 Published: 2011-02-01
  • Corresponding author: LING Guang
  • Supported by:
    High-performance computing based on graphics processing units; Natural Science Foundation of Tianjin

Abstract: First, a rigorous mathematical definition of the co-occurrence similarity between categorical attribute values is given, and three equivalent formulations of it are derived. The definition is then extended from attribute values to data objects and applied to clustering ensembles. When the similarity between data objects is computed for one initial clustering result, the co-occurrence similarity takes into account the interactions and relationships between that result and the other initial clustering results. Experimental results show that the Co-occurrence Similarity based Clustering Ensemble (CSCE) method can effectively identify subtle structures in the data and helps improve the quality of the clustering ensemble.

Key words: clustering ensemble, binary similarity, co-occurrence similarity, Cluster-based Similarity Partitioning Algorithm (CSPA), Co-occurrence Similarity based Clustering Ensemble (CSCE)
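
For context, the binary similarity listed among the key words is the baseline that CSPA builds on: two objects have similarity 1 in an initial clustering if they share a cluster and 0 otherwise, and the per-clustering matrices are averaged across all initial clusterings. The following is a minimal Python sketch of that baseline only, not of the authors' co-occurrence similarity, whose definition is not given on this page. The function names and the example labels are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def binary_similarity(labels: Sequence[int]) -> List[List[float]]:
    """Binary similarity for one clustering: 1.0 if two objects share a cluster, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)] for i in range(n)]


def co_association(clusterings: Sequence[Sequence[int]]) -> List[List[float]]:
    """Average the binary similarity matrices of all initial clusterings."""
    if not clusterings:
        raise ValueError("at least one clustering is required")
    n = len(clusterings[0])
    if any(len(c) != n for c in clusterings):
        raise ValueError("all clusterings must label the same objects")
    total = [[0.0] * n for _ in range(n)]
    for labels in clusterings:
        sim = binary_similarity(labels)
        for i in range(n):
            for j in range(n):
                total[i][j] += sim[i][j]
    m = len(clusterings)
    return [[value / m for value in row] for row in total]


# Three hypothetical initial clusterings of four objects (cluster ids are arbitrary).
example_clusterings = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
consensus = co_association(example_clusterings)
```

Here `consensus[i][j]` is the fraction of initial clusterings that place objects `i` and `j` in the same cluster. Each binary matrix is computed from its own clustering in isolation, which is the limitation the abstract's co-occurrence similarity is meant to address by letting the other initial clusterings influence each individual similarity matrix.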