Journal of Computer Applications (计算机应用) ›› 2015, Vol. 35 ›› Issue (10): 2742-2746. DOI: 10.11772/j.issn.1001-9081.2015.10.2742

• Papers of the 15th China Conference on Machine Learning (CCML2015) •

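Overlapping community discovery method based on symmetric nonnegative matrix factorization

HU Liying (胡丽莹), GUO Gongde (郭躬德), MA Changfeng (马昌凤)

  1. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou Fujian 350007, China
  • Received: 2015-06-17 Revised: 2015-07-07 Online: 2015-10-10 Published: 2015-10-14
  • Corresponding author: HU Liying (born 1979), female, from Jinyun, Zhejiang; lecturer and Ph.D. candidate; research interests: community discovery, optimization theory and algorithms. hlyxyz@fjnu.edu.cn
  • About the authors: GUO Gongde (born 1965), male, from Longyan, Fujian; professor, doctoral supervisor, Ph.D., CCF member; research interests: artificial intelligence, machine learning, data mining, text classification. MA Changfeng (born 1962), male, from Longhui, Hunan; professor, doctoral supervisor, Ph.D.; research interests: optimization theory and algorithms, numerical algebra, numerical solution of partial differential equations, numerical methods for variational inequalities and complementarity problems.
  • Supported by:
    National Natural Science Foundation of China (61175123); Fujian Provincial Department of Education Class A Project (JA15139); Fuzhou Science and Technology Plan Project (2014-G-80); Fujian Province Youth Innovation Project (2014J05002).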

Abstract: To discover the important nodes in overlapping communities (overlapping nodes, central nodes, and outlier nodes) together with the inherent overlapping community structure, a new symmetric nonnegative matrix factorization algorithm was proposed. First, the sum of the approximation error term and the asymmetry penalty term was taken as the objective function. Then the algorithm was derived from the gradient update principle under the nonnegativity constraints. Simulation experiments were carried out on five real-world networks, and the results show that the proposed algorithm can find the important nodes of these networks and their inherent community structures. In terms of the average conductance of the community discovery results and the execution time, the proposed method outperforms the Community Detection with Nonnegative Matrix Factorization (CDNMF) method; in terms of the weighted average of the harmonic mean of precision and recall, the proposed method is better suited to overlapping community discovery on larger datasets.
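The abstract does not give the objective or the update rules explicitly, so the following is only an illustrative sketch of one common way to set up a penalized symmetric factorization, not the authors' algorithm. It assumes the objective ||A - W H^T||_F^2 + alpha * ||W - H||_F^2 with W, H >= 0 and uses Lee-Seung-style multiplicative updates, which keep the factors nonnegative. The names `snmf`, `update`, `objective`, `memberships`, the weight `alpha`, the threshold `ratio`, and the toy graph are all hypothetical choices made for this example.

```python
import random

def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def objective(A, W, H, alpha):
    """Assumed objective: ||A - W H^T||_F^2 + alpha * ||W - H||_F^2."""
    R = matmul(W, transpose(H))
    fit = sum((a - r) ** 2 for ra, rr in zip(A, R) for a, r in zip(ra, rr))
    asym = sum((w - h) ** 2 for rw, rh in zip(W, H) for w, h in zip(rw, rh))
    return fit + alpha * asym

def update(A, W, H, alpha, eps=1e-9):
    """One multiplicative step for W with H held fixed (A is symmetric)."""
    AH = matmul(A, H)                           # n x k
    WHtH = matmul(W, matmul(transpose(H), H))   # n x k
    return [[w * (ah + alpha * h) / (whh + alpha * w + eps)
             for w, h, ah, whh in zip(rw, rh, rah, rwhh)]
            for rw, rh, rah, rwhh in zip(W, H, AH, WHtH)]

def snmf(A, k, alpha=1.0, iters=200, seed=0):
    """Alternate the multiplicative step between W and H."""
    rng = random.Random(seed)
    n = len(A)
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        W = update(A, W, H, alpha)
        H = update(A, H, W, alpha)  # same rule with the roles swapped
    return W, H

def memberships(W, ratio=0.5):
    """Hypothetical rule: node i joins every community whose weight is
    at least `ratio` times the largest weight in row i."""
    return [{c for c, v in enumerate(row) if v >= ratio * max(row)} for row in W]

# Toy graph (an assumption for illustration): two triangles bridged by node 3.
edges = [(0, 1), (0, 2), (1, 2), (4, 5), (4, 6), (5, 6), (2, 3), (3, 4)]
n = 7
A = [[0.0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1.0

W, H = snmf(A, k=2)
for node, comms in enumerate(memberships(W)):
    print(node, sorted(comms))
```

Under this reading, a node whose row of W carries comparable weight in more than one column would be reported as an overlapping node, and a row with uniformly small weights would suggest an outlier. The thresholding rule is a placeholder; the paper's actual criteria for overlapping, central, and outlier nodes are not stated in the abstract.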

Key words: complex network, overlapping community, community discovery, symmetric nonnegative matrix factorization, adjacency matrix

CLC number: