《计算机应用》唯一官方网站 ›› 2022, Vol. 42 ›› Issue (4): 1162-1169.DOI: 10.11772/j.issn.1001-9081.2021071183
所属专题: CCF第36届中国计算机应用大会 (CCF NCCA 2021)
• CCF第36届中国计算机应用大会 (CCF NCCA 2021) • 上一篇 下一篇
收稿日期:
2021-07-08
修回日期:
2021-09-02
接受日期:
2021-09-03
发布日期:
2021-09-13
出版日期:
2022-04-10
通讯作者:
魏娜娜
作者简介:
陈晶(1976—),女,河北秦皇岛人,副教授,博士,CCF会员,主要研究方向:对等网络、Web服务、社交网络分析基金资助:
Jing CHEN1,2,3, Jiangchuan LIU1, Nana WEI1()
Received:
2021-07-08
Revised:
2021-09-02
Accepted:
2021-09-03
Online:
2021-09-13
Published:
2022-04-10
Contact:
Nana WEI
About author:
CHEN Jing, born in 1976, Ph. D., associate professor. Her research interests include peer-to-peer networks, Web service, social network analysis.Supported by:
摘要:
针对标签传播算法稳定性不足、准确性较差的问题,提出了融合K-shell和标签熵的标签传播重叠社区发现算法OCKELP。首先,采用K-shell算法减少了标签初始化时间,并利用标签熵的更新序列提高了算法的稳定性;其次,引入综合影响力进行标签选择,并将社区层次信息和节点局部信息融合提高了算法的准确性。在真实网络数据集上,OCKELP相较于重叠社区发现算法(COPRA)、基于多核心标签传播的重叠社区识别方法(OMKLP)、SLPA的模块度最大提升分别约68.64%、53.99%、42.29%,在人工网络数据集的归一化互信息(NMI)值上,OCKELP相较于其他三种算法也有着明显优势,且随着重叠节点隶属社区数量的增加可以挖掘出社区的真实结构。
中图分类号:
陈晶, 刘江川, 魏娜娜. 融合K-shell和标签熵的重叠社区发现算法[J]. 计算机应用, 2022, 42(4): 1162-1169.
Jing CHEN, Jiangchuan LIU, Nana WEI. Overlapping community detection algorithm combining K-shell and label entropy[J]. Journal of Computer Applications, 2022, 42(4): 1162-1169.
数据集 | 节点数 | 边数 | 描述 |
---|---|---|---|
Karate | 34 | 78 | 空手道俱乐部网络 |
Dolphins | 62 | 159 | 海豚社会网络 |
Football | 115 | 613 | 足球比赛网络 |
Polbooks | 105 | 441 | 美国政治之书网络 |
1 133 | 5 451 | 电子邮件交往网络 | |
Internet | 22 963 | 48 436 | 互联网快照网络 |
表1 真实网络数据集
Tab. 1 Real network datasets
数据集 | 节点数 | 边数 | 描述 |
---|---|---|---|
Karate | 34 | 78 | 空手道俱乐部网络 |
Dolphins | 62 | 159 | 海豚社会网络 |
Football | 115 | 613 | 足球比赛网络 |
Polbooks | 105 | 441 | 美国政治之书网络 |
1 133 | 5 451 | 电子邮件交往网络 | |
Internet | 22 963 | 48 436 | 互联网快照网络 |
参数 | 描述 | 参数 | 描述 |
---|---|---|---|
N | 网络节点数 | minc | 最小规模社区内节点数 |
k | 网络平均度数 | maxc | 最大规模社区内节点数 |
maxk | 网络最大度数 | on | 网络重叠节点数 |
mu | 网络内部链接强度系数 | om | 重叠节点隶属社区的数量 |
表2 LFR基准网络参数描述
Tab. 2 LFR benchmark network parameter description
参数 | 描述 | 参数 | 描述 |
---|---|---|---|
N | 网络节点数 | minc | 最小规模社区内节点数 |
k | 网络平均度数 | maxc | 最大规模社区内节点数 |
maxk | 网络最大度数 | on | 网络重叠节点数 |
mu | 网络内部链接强度系数 | om | 重叠节点隶属社区的数量 |
参数 | 网络 | |||
---|---|---|---|---|
R1 | R2 | R3 | R4 | |
N | 200 | 200 | 1 000 | 1 000 |
k | 10 | 10 | 10 | 10 |
maxk | 30 | 30 | 30 | 30 |
mu | 0.1 | 0.3 | 0.1~0.5 | 0.1~0.5 |
minc | 20 | 20 | 20 | 20 |
maxc | 50 | 50 | 50 | 50 |
on | 20 | 20 | 100 | 200 |
om | 2~6 | 2~6 | 4 | 4 |
表3 四个LFR基准网络参数
Tab. 3 Four LFR benchmark network parameters
参数 | 网络 | |||
---|---|---|---|---|
R1 | R2 | R3 | R4 | |
N | 200 | 200 | 1 000 | 1 000 |
k | 10 | 10 | 10 | 10 |
maxk | 30 | 30 | 30 | 30 |
mu | 0.1 | 0.3 | 0.1~0.5 | 0.1~0.5 |
minc | 20 | 20 | 20 | 20 |
maxc | 50 | 50 | 50 | 50 |
on | 20 | 20 | 100 | 200 |
om | 2~6 | 2~6 | 4 | 4 |
社区编号 | 节点编号 |
---|---|
1 | 1,2,3,4,5,6,7,8,11,12,13,14,17,18,20,22 |
2 | 9,10,15,16,19,21,23,24,25,26,27,28,29,30,31,32,33,34 |
表4 Karate数据集网络真实划分
Tab. 4 Real partition of Karate dataset network
社区编号 | 节点编号 |
---|---|
1 | 1,2,3,4,5,6,7,8,11,12,13,14,17,18,20,22 |
2 | 9,10,15,16,19,21,23,24,25,26,27,28,29,30,31,32,33,34 |
数据集 | OCKELP | OMKLP | COPRA | MNMF | NNSED | SLPA |
---|---|---|---|---|---|---|
Karate | 0.371 7 | 0.367 9 | 0.353 1 | 0.186 4 | 0.025 9 | 0.214 5 |
Dolphins | 0.513 3 | 0.519 1 | 0.439 4 | 0.391 5 | 0.018 4 | 0.481 4 |
Football | 0.569 5 | — | 0.511 6 | 0.603 0 | 0.277 9 | 0.535 3 |
Polbooks | 0.525 3 | 0.484 2 | 0.456 5 | 0.385 7 | 0.256 5 | 0.505 2 |
0.428 1 | 0.307 9 | 0.352 3 | 0.459 6 | 0.204 0 | 0.417 1 | |
Internet | 0.402 7 | 0.185 3 | 0.126 3 | 0.393 2 | — | 0.388 7 |
表5 不同算法的EQ值比较结果
Tab. 5 Comparison results of EQ values of different algorithms
数据集 | OCKELP | OMKLP | COPRA | MNMF | NNSED | SLPA |
---|---|---|---|---|---|---|
Karate | 0.371 7 | 0.367 9 | 0.353 1 | 0.186 4 | 0.025 9 | 0.214 5 |
Dolphins | 0.513 3 | 0.519 1 | 0.439 4 | 0.391 5 | 0.018 4 | 0.481 4 |
Football | 0.569 5 | — | 0.511 6 | 0.603 0 | 0.277 9 | 0.535 3 |
Polbooks | 0.525 3 | 0.484 2 | 0.456 5 | 0.385 7 | 0.256 5 | 0.505 2 |
0.428 1 | 0.307 9 | 0.352 3 | 0.459 6 | 0.204 0 | 0.417 1 | |
Internet | 0.402 7 | 0.185 3 | 0.126 3 | 0.393 2 | — | 0.388 7 |
1 | NEWMAN M E J, GIRVAN M. Finding and evaluating community structure in networks[J]. Physical Review, E: Statistical, Nonlinear, and Soft Matter Physics, 2004, 69(2 Pt 2): No.026113. 10.1103/physreve.69.026113 |
2 | GIRVAN M, NEWMAN M E J. Community structure in social and biological networks[J]. Proceedings of the National Academy of Sciences of the United States of America, 2002, 99(12): 7821-7826. 10.1073/pnas.122653799 |
3 | 杨立文. 基于改进的 GN 算法的社区发现技术[D]. 长春:吉林大学, 2012: 27-32. |
YANG L W. Community detecting technology based on improved GN algorithm[D]. Changchun: Jilin University, 2012: 27-32. | |
4 | WHITE S, SMYTH P. A spectral clustering approach to finding communities in graphs[C]// Proceedings of the 2005 SIAM International Conference on Data Mining. Philadelphia, PA: SIAM, 2005: 274-285. 10.1137/1.9781611972757.25 |
5 | YANG K H, XU Y S. An effective method for complex network community detection based on hierarchical splitting[C]// Proceedings of the 4th International Conference on Mathematics and Artificial Intelligence. New York: ACM, 2019: 10-14. 10.1145/3325730.3325747 |
6 | ZHU X, GHAHRAMANI Z. Learning from labels and unlabeled data with label propagation[R]. Pittsburgh, PA: Carnegie Mellon University, 2002. 10.1007/978-3-540-28649-3_29 |
7 | RAGHAVAN U N, ALBERT R, KUMARA S. Near linear time algorithm to detect community structures in large-scale networks[J]. Physical Review, E: Statistical, Nonlinear, and Soft Matter Physics, 2007, 76(3 Pt 2): No.036106. 10.1103/physreve.76.036106 |
8 | SUN H L, LIU J, HUANG J B, et al. CenLP: a centrality-based label propagation algorithm for community detection in networks[J]. Physica A: Statistical Mechanics and its Applications, 2015, 436: 767-780. 10.1016/j.physa.2015.05.080 |
9 | XIE J R, SZYMANSKI B K. LabelRank: a stabilized label propagation algorithm for community detection in networks[C]// Proceedings of the IEEE 2nd Network Science Workshop. Piscataway: IEEE, 2013: 138-143. 10.1109/nsw.2013.6609210 |
10 | BLONDEL V D, GUILLAUME J L, LAMBIOTTE R, et al. Fast unfolding of communities in large networks[J]. Journal of Statistical Mechanics: Theory and Experiment, 2008, 2008(10): No.P10008. 10.1088/1742-5468/2008/10/p10008 |
11 | 薛青. 改进的Louvain社团发现算法的研究及应用[D]. 太原:山西大学, 2020: 15-23. |
XUE Q. Improvement of Louvain community detection algorithm and its application[D]. Taiyuan: Shanxi University, 2020: 15-23. | |
12 | WALTMAN L, ECK N J VAN. A smart local moving algorithm for large-scale modularity-based community detection[J]. The European Physical Journal B, 2013, 86(11): No.471. 10.1140/epjb/e2013-40829-0 |
13 | PALLA G, DERÉNYI I, FARKAS I, et al. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature, 2005, 435(7043): 814-818. 10.1038/nature03607 |
14 | GREGORY S. Finding overlapping communities in networks by label propagation[J]. New Journal of Physics, 2010, 12(10): No.103018. 10.1088/1367-2630/12/10/103018 |
15 | WU Z H, LIN Y F, GREGORY S, et al. Balanced multi-label propagation for overlapping community detection in social networks[J]. Journal of Computer Science and Technology, 2012, 27(3): 468-479. 10.1007/s11390-012-1236-x |
16 | 沈海燕,李星毅. 一种新的基于标签传播的重叠社区发现算法[J]. 软件导刊, 2015, 14(4): 59-62. 10.11907/rjdk.1431038 |
SHEN H Y, LI X Y. A new algorithm for overlapping community discovery based on label propagation[J]. Software Guide, 2015, 14(4): 59-62. 10.11907/rjdk.1431038 | |
17 | 邓琨,李文平,余法红,等. 基于多核心标签传播的复杂网络重叠社区识别方法[J]. 通信学报, 2017, 38(2):53-66. 10.11959/j.issn.1000-436x.2017028 |
DENG K, LI W P, YU F H, et al. Overlapping community detection in complex networks based on multi kernel label propagation[J]. Journal on Communications, 2017, 38(2): 53-66. 10.11959/j.issn.1000-436x.2017028 | |
18 | LU M L, ZHANG Z L, QU Z H, et al. LPANNI: overlapping community detection using label propagation in large-scale complex networks[J]. IEEE Transactions on Knowledge and Data Engineering, 2019, 31(9): 1736-1749. 10.1109/tkde.2018.2866424 |
19 | CHENG F, WANG C T, ZHANG X Y, et al. A local-neighborhood information based overlapping community detection algorithm for large-scale complex networks[J]. IEEE/ACM Transactions on Networking, 2021, 29(2): 543-556. 10.1109/tnet.2020.3038756 |
20 | SHI X J, WANG Y, HUANG K K, et al. An overlapping community discovery algorithm based on label propagation[C]// Proceedings of the 2019 International Conference on Communications, Information System and Computer Engineering. Piscataway: IEEE, 2019: 395-398. 10.1109/cisce.2019.00093 |
21 | TANG M L, LIU Q, MA T H, et al. K-lowest-influence overlapping nodes based community detection in complex networks[J]. IEEE Access, 2019, 7: 109646-109661. 10.1109/access.2019.2930474 |
22 | XIE J R, SZYMANSKI B K, LIU X M. SLPA: uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process[C]// Proceedings of the IEEE 11th International Conference on Data Mining Workshops. Piscataway: IEEE, 2011: 344-349. 10.1109/icdmw.2011.154 |
23 | KITSAK M, GALLOS L K, HAVLIN S, et al. Identification of influential spreaders in complex networks[J]. Nature Physics, 2010, 6(11): 888-893. 10.1038/nphys1746 |
24 | ZHAO Y X, LI S, CHEN X Z H. Community detection using label propagation in entropic order[C]// Proceedings of the 12th International Conference on Computer and Information Technology. Piscataway: IEEE, 2012: 18-24. 10.1109/cit.2012.30 |
25 | NICOSIA V, MANGIONI G, CARCHIOLO V, et al. Extending the definition of modularity to directed graphs with overlapping communities[J]. Journal of Statistical Mechanics: Theory and Experiment, 2009, 2009(3): No.P03024. 10.1088/1742-5468/2009/03/p03024 |
26 | SHEN H W, CHENG X Q, CAI K, et al. Detect overlapping and hierarchical community structure in networks[J]. Physica A: Statistical Mechanics and its Applications, 2009, 388(8): 1706-1712. 10.1016/j.physa.2008.12.021 |
27 | LANCICHINETTI A, FORTUNATO S, KERTÉSZ J. Detecting the overlapping and hierarchical community structure in complex networks[J]. New Journal of Physics, 2009, 11(3): No.033015. 10.1088/1367-2630/11/3/033015 |
28 | MCDAID A F, GREENE D, HURLEY N. Normalized mutual information to evaluate overlapping community finding algorithms[EB/OL]. (2013-08-02) [2021-08-09].. |
29 | LANCICHINETTI A, FORTUNATO S, RADICCHI F. Benchmark graphs for testing community detection algorithms[J]. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics, 2008, 78(4 Pt 2): No,046110. 10.1103/physreve.78.046110 |
30 | WANG X, CUI P, WANG J, et al. Community preserving network embedding[C]// Proceedings of the 31st AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2017: 203-209. 10.1609/aaai.v33i01.33015337 |
31 | SUN B J, SHEN H W, GAO J H, et al. A non-negative symmetric encoder-decoder approach for community detection[C]// Proceedings of the 2017 ACM Conference on Information and Knowledge Management. New York: ACM, 2017: 597-606. 10.1145/3132847.3132902 |
[1] | 刘世梁, 王义, 马应龙. 考虑社区规模不平衡的非重叠社区检测[J]. 《计算机应用》唯一官方网站, 2024, 44(11): 3396-3402. |
[2] | 李翔锟, 贾彩燕. 融合重叠社区正则化及隐式反馈的协同过滤方法[J]. 计算机应用, 2021, 41(1): 53-59. |
[3] | 吕亚丽, 苗钧重, 胡玮昕. 基于标签进行度量学习的图半监督学习算法[J]. 计算机应用, 2020, 40(12): 3430-3436. |
[4] | 吴清寿, 陈荣旺, 余文森, 刘耿耿. 融合标签预处理与节点影响力的重叠社区发现算法[J]. 计算机应用, 2020, 40(12): 3578-3585. |
[5] | 郑文萍, 岳香豆, 杨贵. 基于随机游走的改进标签传播算法[J]. 计算机应用, 2020, 40(12): 3423-3429. |
[6] | 成其伟, 陈启买, 贺超波, 刘海. 基于改进对称二值非负矩阵分解的重叠社区发现方法[J]. 计算机应用, 2020, 40(11): 3203-3210. |
[7] | 杜航原, 裴希亚, 王文剑. 面向属性网络的重叠社区发现算法[J]. 计算机应用, 2019, 39(11): 3151-3157. |
[8] | 顾军华, 霍士杰, 王守彬, 田喆. 基于节点中心性和社区相似性的快速标签传播算法[J]. 计算机应用, 2018, 38(5): 1320-1326. |
[9] | 汪焱, 黄发良, 元昌安. 基于标签影响力的半同步社区发现算法[J]. 计算机应用, 2016, 36(6): 1573-1578. |
[10] | 黄泳航, 汤庸, 李春英, 汤志康, 刘继伟. 基于社区划分的学术论文推荐模型[J]. 计算机应用, 2016, 36(5): 1279-1283. |
[11] | 李春英, 汤庸, 汤志康, 黄泳航, 袁成哲, 赵剑冬. 面向大规模学术社交网络的社区发现模型[J]. 计算机应用, 2015, 35(9): 2565-2568. |
[12] | 石梦雨, 周勇, 邢艳. 基于LeaderRank的标签传播社区发现算法[J]. 计算机应用, 2015, 35(2): 448-451. |
[13] | 胡丽莹, 郭躬德, 马昌凤. 基于对称非负矩阵分解的重叠社区发现方法[J]. 计算机应用, 2015, 35(10): 2742-2746. |
[14] | 石立新 张俊星. 基于势函数的标签传播社区发现算法[J]. 计算机应用, 2014, 34(3): 738-741. |
[15] | 吴钟 聂规划 陈冬林 章佩璐. 基于协同过滤的Web服务动态社区发现算[J]. 计算机应用, 2013, 33(08): 2095-2099. |
阅读次数 | ||||||
全文 |
|
|||||
摘要 |
|
|||||