Overlapping community detection algorithm fusing label preprocessing and node influence


Journal of Computer Applications, 2020, Vol. 40, Issue 12: 3578-3585. DOI: 10.11772/j.issn.1001-9081.2020060942

Section: Network and communications


WU Qingshou^1,2, CHEN Rongwang^1, YU Wensen^1,2, LIU Genggeng^3

1. School of Mathematics and Computer Science, Wuyi University, Wuyishan, Fujian 354300, China;
2. Key Laboratory of Cognitive Computing and Intelligent Information Processing of Fujian Education Institutions (Wuyi University), Wuyishan, Fujian 354300, China;
3. College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350116, China

Received: 2020-05-31 Revised: 2020-07-29 Online: 2020-08-11 Published: 2020-12-10
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61877010) and the Natural Science Foundation of Fujian Province (2019J01835).


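Corresponding author: YU Wensen (born 1973), male, from Nanping, Fujian; professor, Ph.D., CCF member. Main research interests: machine learning, image processing. 45509111@qq.com
About the authors: WU Qingshou (born 1977), male, from Putian, Fujian; associate professor, M.S., CCF member. Main research interests: machine learning, complex network analysis. CHEN Rongwang (born 1970), male, from Nanping, Fujian; senior engineer, M.S. Main research interests: machine learning, complex network analysis. LIU Genggeng (born 1988), male, from Quanzhou, Fujian; associate professor, doctoral supervisor, Ph.D., CCF member. Main research interests: intelligent computing.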

Abstract

To address the scattered initial labels and high randomness of label propagation, an overlapping community detection algorithm fusing label preprocessing and node influence was proposed. Firstly, the influence value of each node was calculated, and the nodes with the largest influence values were selected one by one as central nodes. Secondly, the label of each central node was used to preprocess the labels of its homogeneous neighbor nodes, which reduced the number of initial labels, lowered the randomness of subsequent label propagation, and preliminarily identified the overlapping nodes. Thirdly, overlapping nodes were identified by the label belonging coefficient, and the labels of non-overlapping nodes were selected according to node influence values, improving the stability and accuracy of the algorithm. Finally, communities with weak cohesion were merged with the goal of maximizing the increment of the adaptive function, improving community quality. Simulation results show that the proposed algorithm achieves the largest extended modularity value on 50% of the six real-network datasets, and the best Normalized Mutual Information (NMI) on artificial benchmark networks with different mixing degrees, node overlapping degrees, and maximum numbers of communities to which a node belongs. In conclusion, the algorithm adapts well to various kinds of networks and has nearly linear time complexity.

Key words: overlapping community, central node, label propagation, node influence, label belonging coefficient
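
The abstract reports results in terms of extended modularity. As a minimal sketch, assuming this refers to the widely used overlapping-modularity measure EQ = (1/2m) Σ_c Σ_{i,j∈c} [A_ij − k_i k_j / (2m)] / (O_i O_j), where O_i is the number of communities containing node i, the metric could be computed as follows. The function name `extended_modularity` and the edge-list input format are illustrative assumptions, not taken from the paper.

```python
def extended_modularity(edges, communities):
    """Extended modularity (EQ) for an undirected, unweighted graph.

    edges:       iterable of (u, v) pairs without self-loops or duplicates
    communities: list of sets of nodes; sets may overlap
    Assumption: this is the common EQ definition, which may differ in
    detail from the variant used in the paper.
    """
    # Build an adjacency map so degrees and edge lookups are O(1).
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    if m == 0:
        raise ValueError("graph has no edges")

    # O[v]: number of communities that node v belongs to.
    O = {}
    for community in communities:
        for v in community:
            O[v] = O.get(v, 0) + 1

    total = 0.0
    for community in communities:
        for i in community:
            k_i = len(adj.get(i, ()))
            for j in community:  # ordered pairs, including i == j
                k_j = len(adj.get(j, ()))
                a_ij = 1.0 if j in adj.get(i, ()) else 0.0
                total += (a_ij - k_i * k_j / (2 * m)) / (O[i] * O[j])
    return total / (2 * m)


if __name__ == "__main__":
    # Two triangles sharing node 2; node 2 is the overlapping node.
    bowtie = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
    print(extended_modularity(bowtie, [{0, 1, 2}, {2, 3, 4}]))
```

When no node belongs to more than one community, every O_i equals 1 and the expression reduces to the standard Newman modularity, which is a convenient sanity check. The double loop over each community is quadratic in community size, so this sketch is meant for small examples rather than large networks.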


CLC Number: TP301.6

WU Qingshou, CHEN Rongwang, YU Wensen, LIU Genggeng. Overlapping community detection algorithm fusing label preprocessing and node influence[J]. Journal of Computer Applications, 2020, 40(12): 3578-3585.



