计算机应用 ›› 2012, Vol. 32 ›› Issue (10): 2884-2887.DOI: 10.3724/SP.J.1087.2012.02884

• 人工智能 • 上一篇    下一篇

引入共享近邻加权图的Chameleon算法

薛文娟1,2,刘培玉1,2,刘栋1,2   

  1. 1. 山东省分布式计算机软件新技术重点实验室,济南 250014
    2. 山东师范大学 信息科学与工程学院,济南 250014
  • 收稿日期:2012-04-25 修回日期:2012-06-05 发布日期:2012-10-23 出版日期:2012-10-01
  • 通讯作者: 薛文娟
  • 作者简介:薛文娟(1986-),女,山东济宁人,硕士研究生,CCF会员,主要研究方向:网络信息安全、网络安全审计;刘培玉(1960-),男,山东潍坊人,教授,博士生导师,主要研究方向:计算机网络信息安全、网络系统规划、网络信息资源和软件;刘栋(1987-),男,山东泰安人,硕士研究生,CCF会员,主要研究方向:网络信息安全、网络安全审计。
  • 基金资助:
    国家自然基金;山东省高新自主创新专项工程(2008ZZ28);山东省自然科学基金资助项目(Y2008G34)

Improved Chameleon algorithm using weighted nearest neighbors graph

XUE Wen-juan1,2,LIU Pei-yu1,2,LIU Dong1,2   

  1. 1. School of Information Science and Engineering, Shandong Normal University, Jinan Shandong 250014,China
    2. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan Shandong 250014,China
  • Received:2012-04-25 Revised:2012-06-05 Online:2012-10-23 Published:2012-10-01
  • Contact: XUE Wen-juan

摘要: 针对Chameleon算法中采用距离函数度量数据点间的相似度,导致距离相近的两个点可能仅拥有很少的共同特征,最小二分实际操作困难,合并时需要人工指定阈值以及一旦合并完成后不能撤销的问题,对Chameleon算法进行改进,提出一种引入共享近邻加权图(WSnnG)的Chameleon算法。该算法以数据对象间的共享近邻数来衡量相似度,进一步构造WSnnG,再利用网络模块性评价函数指导最小二分,然后以结构等价相似度作为合并的依据,最后通过引入内聚度度量函数解决合并后不能撤销的问题。在UCI数据集及4个二维人造数据集上的实验结果表明,该算法在聚类精度和运行时间方面具有更好的效果。

关键词: 共享近邻加权图, 最小二分, 网络模块性评价函数, 结构等价相似度, 内聚度度量函数

Abstract: For the Chameleon algorithm using distance function to measure the similarity of data points, resulting in that the two proximate points may only have a few common characteristics, minimum half has practical difficulties, the merger needs artificial specified threshold value, and can not be revoked once the merger is completed. Therefore, the authors improved Chameleon algorithm and proposed a new Chameleon algorithm using Weighted Shared nearest neighbors Graph (WSnnG). Firstly, it measured the similarity by using the number of shared nearest neighbors, further constructed the WSnnG. Secondly, it resolved minimum half through the introduction of the network module evaluation function, then according to the structural equivalence similarity degree as a basis for merger. Finally, a new cohesion measure was discussed to solve problems that can not be revoked after the merger. The experimental results on UCI data sets and four two-dimensional artificial data sets show that the improved Chameleon algorithm using WSnnG has greatly improved in clustering accuracy and running time.

Key words: Weighted Shared nearest neighbors Graph (WSnnG), minimum half, network module evaluation function, structural equivalence similarity degree, cohesion measure

中图分类号: