计算机应用 ›› 2015, Vol. 35 ›› Issue (12): 3511-3514.DOI: 10.11772/j.issn.1001-9081.2015.12.3511

• 人工智能 • 上一篇    下一篇

基于聚类分析的二分网络社区挖掘

张嫱嫱1, 黄廷磊2, 张银明1   

  1. 1. 桂林电子科技大学计算机科学与工程学院, 广西桂林 541004;
    2. 中国科学院电子学研究所, 北京 100190
  • 收稿日期:2015-06-12 修回日期:2015-09-10 出版日期:2015-12-10 发布日期:2015-12-10
  • 通讯作者: 黄廷磊(1971-),男,安徽肥东人,教授,博士,主要研究方向:数据挖掘、智能计算、无线Mesh网
  • 作者简介:张嫱嫱(1990-),女,河南周口人,硕士研究生,主要研究方向:数据挖掘、组织和可视化;张银明(1991-),男,河北邯郸人,硕士研究生,主要研究方向:数据挖掘、组织和可视化。
  • 基金资助:
    国家863计划项目(2012AA011005)。

Detecting community in bipartite network based on cluster analysis

ZHANG Qiangqiang1, HUANG Tinglei2, ZHANG Yinming1   

  1. 1. College of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin Guangxi 541004, China;
    2. Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China
  • Received:2015-06-12 Revised:2015-09-10 Online:2015-12-10 Published:2015-12-10

摘要: 针对二分网络中社区挖掘的准确性不高、对额外参数的依赖较大的问题,基于谱聚类算法的思想,从二分网络的拓扑结构展开,提出了一种改进的社区挖掘算法。该算法将二分网络映射到单一网络进行社区挖掘,采用资源分布矩阵替代传统的邻接矩阵,挖掘出同类节点间的隐含信息,有效地保证了原图的信息,改进了谱聚类算法的输入,提高了社区挖掘的准确性;将模块度函数概念应用到聚类分析中,用模块度衡量社区挖掘的质量,有效解决了自动确定聚类数目的问题。在实际网络和人造网络上进行实验,与蚁群优化算法、边集聚系数算法等算法进行对比,实验结果表明,所提算法不但能较准确地获得二分网络的社区数目,且在不需要任何额外参数的情况下,能获得很好的划分效果,可以应用于深入理解二分网络,进行推荐、影响力分析等。

关键词: 二分网络, 资源矩阵, 谱聚类算法, 特征值, 特征向量, 模块度

Abstract: Concerning the problems of the low accuracy of community detection in bipartite network and the strong dependence on additional parameters, depending on the original network topology, based on the idea of spectral clustering algorithm, a new community algorithm was proposed. The proposed algorithm mined community by mapping a bipartite network to a single network, substituted resource distribution matrix for traditional similarity matrix, effectively guaranteed the information of the original network, improved the input of spectral clustering algorithm and the accuracy of community detection. The modularity function was applied to clustering analysis, and the modularity was used to measure the quality of community mining, effectively solved the problem of automatically determining the clustering number. The experimental results on the actual network and artificial network show that, compared with ant colony optimization algorithm, edge clustering coefficient algorithm etc., the proposed algorithm can not only accurately identify the number of the communities of the bipartite network, but also obtain higher quality of community partitioning without previously known parameters. The proposed algorithm can be applied to the deep understanding of bipartite network, such as recommendation and influence analysis.

Key words: bipartite network, resource matrix, spectral clustering algorithm, eigenvalue, eigenvector, modularity

中图分类号: