Deep network embedding method based on community optimization

doi:10.11772/j.issn.1001-9081.2020081193

Journal of Computer Applications, 2021, Vol. 41, Issue (7): 1956-1963. DOI: 10.11772/j.issn.1001-9081.2020081193

Special Issue: Data science and technology

LI Yafang, LIANG Ye, FENG Weiwei, ZU Baokai, KANG Yujian

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Received: 2020-08-12; Revised: 2020-12-17; Online: 2021-07-10; Published: 2021-01-22
Supported by:
This work is partially supported by the Beijing Municipal Natural Science Foundation (4204085), the General Science and Technology Program of the Beijing Municipal Education Commission (KM202010005015), and the China Postdoctoral Science Foundation (2019M650407).

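Corresponding author: LI Yafang
About the authors: LI Yafang, born in 1988 in Cangzhou, Hebei, Ph.D., lecturer, CCF member. Her research interests include data mining and complex network analysis. LIANG Ye, born in 1997 in Beijing, M.S. candidate. His research interests include deep learning and network data mining. FENG Weiwei, born in 1999 in Beijing. Her research interests include data mining. ZU Baokai, born in 1988 in Xinji, Hebei, Ph.D., lecturer. Her research interests include machine learning and data mining. KANG Yujian, born in 1999 in Beijing. His research interests include deep learning and machine learning.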

Abstract

Abstract: With the rapid development of modern network communication and social media, networked big data is difficult to exploit because efficient, usable node representations are lacking. Network representation learning, which transforms high-dimensional sparse network data into low-dimensional, compact and easy-to-use node representations, has therefore attracted wide attention. However, existing network embedding methods first obtain low-dimensional feature vectors of nodes and then feed them to other applications (such as node classification, community discovery, link prediction and visualization) for further analysis, without building models for the specific application, which makes it difficult to achieve satisfactory results. For the specific application of network community discovery, a deep auto-encoder clustering model that incorporates community structure optimization into the low-dimensional feature representation of nodes was proposed, namely Community-Aware Deep Network Embedding (CADNE). Firstly, based on the deep auto-encoder model, low-dimensional node representations were learned by preserving the topological characteristics of the local and global links of the network; then these representations were further optimized by using the clustering structure of the network. The method learns the low-dimensional node representations and the indicator vectors of the communities to which the nodes belong at the same time, so that the representations preserve both the topological characteristics of the original network structure and the clustering characteristics of the nodes. In comparison with existing classical network embedding methods, CADNE achieves the best clustering results on the Citeseer and Cora datasets and improves accuracy by up to 0.525 on 20NewsGroup. In the classification task, CADNE performs best on the Blogcatalog and Citeseer datasets, and its performance on Blogcatalog is improved by up to 0.512 over the baseline methods with 20% training samples. In the visualization comparison, the CADNE model yields low-dimensional node representations with clearer class boundaries, which verifies that the proposed method has a better ability to learn low-dimensional node representations.
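The abstract does not state the loss functions, so the following is only a minimal NumPy sketch of how a joint objective of this kind could be assembled, not the authors' formulation. It assumes a weighted reconstruction term for global link structure, a first-order term that pulls linked nodes together for local structure, and a KL-divergence clustering term on soft community assignments in the style of deep embedded clustering. All function names, the weights `alpha`, `beta`, `gamma`, and the toy graph are hypothetical.

```python
import numpy as np


def soft_assignment(Z, centers):
    """Soft community membership of each node (Student's t kernel; an assumption)."""
    d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(Q):
    """Sharpened target distribution that emphasises confident assignments."""
    w = Q ** 2 / Q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def joint_loss(A, Z, A_hat, centers, alpha=1.0, beta=5.0, gamma=1.0):
    """Hypothetical joint objective: global + alpha * local + gamma * clustering."""
    # Global structure: reconstruct each adjacency row, penalising errors on
    # observed links more heavily (weight beta) than on missing links.
    B = np.where(A > 0, beta, 1.0)
    loss_global = (((A_hat - A) * B) ** 2).sum()
    # Local structure: linked nodes should lie close together in embedding space.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    loss_local = (A * d2).sum()
    # Community structure: KL divergence between target and soft assignments.
    Q = soft_assignment(Z, centers)
    P = target_distribution(Q)
    loss_cluster = (P * np.log(P / Q)).sum()
    return loss_global + alpha * loss_local + gamma * loss_cluster


def toy_graph():
    """Two triangles joined by a single bridge edge (illustrative data only)."""
    A = np.zeros((6, 6))
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
        A[i, j] = A[j, i] = 1.0
    return A


if __name__ == "__main__":
    A = toy_graph()
    centers = np.array([[0.0, 0.0], [3.0, 3.0]])
    Z = np.array([[0.0, 0.0]] * 3 + [[3.0, 3.0]] * 3)
    # A_hat would come from a decoder; here the true adjacency stands in for it.
    print(joint_loss(A, Z, A_hat=A, centers=centers))
```

In a real model `Z` and `A_hat` would be the encoder and decoder outputs of the auto-encoder and all three terms would be minimised together by gradient descent. The sketch only shows how the embedding and the community assignment can share one objective.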

Key words: large-scale complex network, community structure, deep learning, node low-dimensional representation, network embedding

CLC Number: TP391

LI Yafang, LIANG Ye, FENG Weiwei, ZU Baokai, KANG Yujian. Deep network embedding method based on community optimization[J]. Journal of Computer Applications, 2021, 41(7): 1956-1963.
