Journal of Computer Applications, 2023, Vol. 43, Issue (S2): 105-110. DOI: 10.11772/j.issn.1001-9081.2023030393

• Data science and technology •

Overlapping community detection algorithm based on label propagation and multiple metrics

Mingyue WANG1,2, Xiaohong ZOU1,2, Jing CHEN3, Chengwei XU1,2

  1. College of Information Science and Engineering, Yanshan University, Qinhuangdao Hebei 066004, China
  2. Key Laboratory of Computer Virtual Technology and System Integration of Hebei Province (Yanshan University), Qinhuangdao Hebei 066004, China
  3. School of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang Guangdong 524088, China
  • Received: 2023-04-10 Revised: 2023-06-22 Accepted: 2023-07-04 Online: 2024-01-09 Published: 2023-12-31
  • Contact: Jing CHEN
  • About the authors: Mingyue WANG (born 1996), female, from Qinhuangdao, Hebei, M.S., research interest: community discovery.
    Xiaohong ZOU (born 1967), female, from Jilin, Jilin, professor, Ph.D., research interests: graph mining, social networks.
    Jing CHEN (born 1976), female, from Qinhuangdao, Hebei, professor, Ph.D., CCF member, research interests: peer-to-peer networks, Web services.
    Chengwei XU (born 1997), male, from Qiqihar, Heilongjiang, M.S., research interest: social networks.
  • Supported by:
    National Natural Science Foundation of China (62172352); Central Government Guided Local Science and Technology Development Fund (ZD2019004); Hebei Province Innovation Capability Improvement Plan Project (22567626H)

Abstract:

To address the tendency of label-propagation community detection algorithms to produce monster communities and unstable community partitions, an overlapping community detection algorithm based on label entropy, LEKA (Label Entropy and K-shell Algorithm in overlapping community), was proposed. It covers every stage of label initialization, label update and label propagation. Firstly, the K-shell algorithm was used to initialize the nodes and obtain their hierarchical information. Secondly, node labels were updated one by one in ascending order of label entropy. When selecting a label, the hierarchical information of the nodes and the influence between nodes were combined, and when several candidate labels remained, the label was chosen according to the node label weights. Experimental results on real network datasets show that, with a relatively short running time, LEKA improves the overlapping modularity EQ (ExtendQ) by 2.3% to 13.2% compared with OCKELP (Overlapping Community detection algorithm based on K-shell and label Entropy in Label Propagation). LEKA has higher accuracy and stability, and is more suitable for mining overlapping community structures in networks.

Key words: overlapping community detection, label entropy, node influence, label weight, label propagation
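
The first two steps named in the abstract (K-shell initialization, then visiting nodes in ascending order of label entropy) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `k_shell`, `label_entropy` and `update_order` are hypothetical. The entropy used here is the Shannon entropy of the label distribution over a node and its neighbors, which is an assumption because the abstract does not give the formula. The label selection step (node influence and label weights) is omitted because the abstract does not define it.

```python
from collections import Counter
from math import log


def k_shell(adj):
    """Standard k-shell decomposition: return {node: shell index}.

    adj maps each node to the set of its neighbors (undirected graph).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # The next shell index is at least the smallest remaining degree.
        k = max(k, min(degree[v] for v in remaining))
        stack = [v for v in remaining if degree[v] <= k]
        while stack:
            v = stack.pop()
            if v not in remaining:
                continue  # already peeled via another path
            remaining.remove(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        stack.append(u)  # cascades into the same shell
    return shell


def label_entropy(node, adj, labels):
    """ASSUMPTION: Shannon entropy of labels over the node and its neighbors."""
    counts = Counter(labels[u] for u in adj[node])
    counts[labels[node]] += 1
    total = sum(counts.values())
    return -sum(c / total * log(c / total) for c in counts.values())


def update_order(adj, labels):
    """Nodes sorted by ascending label entropy; ties broken by node id."""
    return sorted(adj, key=lambda v: (label_entropy(v, adj, labels), v))


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
labels = {"a": 1, "b": 1, "c": 1, "d": 2}
shells = k_shell(adj)               # d lies in shell 1, the triangle in shell 2
order = update_order(adj, labels)   # ['a', 'b', 'c', 'd']
```

Under this assumed definition, nodes whose neighborhoods already agree on a label have low entropy and are visited first, so stable regions are fixed before ambiguous boundary nodes are updated.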

CLC Number: