Deep network embedding method based on community optimization

Journal of Computer Applications, 2021, Vol. 41, Issue (7): 1956-1963. DOI: 10.11772/j.issn.1001-9081.2020081193

Special topic: Data Science and Technology

LI Yafang, LIANG Ye, FENG Weiwei, ZU Baokai, KANG Yujian

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China

Received: 2020-08-12 Revised: 2020-12-17 Published: 2021-07-10 Online: 2021-01-22
Corresponding author: LI Yafang
About the authors: LI Yafang (born 1988), female, from Cangzhou, Hebei, lecturer, Ph.D., CCF member; research interests: data mining, complex network analysis. LIANG Ye (born 1997), male, from Beijing, M.S. candidate; research interests: deep learning, network data mining. FENG Weiwei (born 1999), female, from Beijing; research interest: data mining. ZU Baokai (born 1988), female, from Xinji, Hebei, lecturer, Ph.D.; research interests: machine learning, data mining. KANG Yujian (born 1999), male, from Beijing; research interests: deep learning, machine learning.
Supported by:
Beijing Municipal Natural Science Foundation (4204085); General Science and Technology Program of Beijing Municipal Education Commission (KM202010005015); China Postdoctoral Science Foundation (2019M650407).

Abstract

Abstract: With the rapid development of modern network communication and social media, networked big data is difficult to exploit because efficient, usable node representations are lacking. Network embedding methods, which transform high-dimensional, sparse network data into low-dimensional, compact, easy-to-use node representations, have therefore attracted wide attention. However, existing network embedding methods first obtain the low-dimensional feature vectors of nodes and only then feed them to other applications (node classification, community discovery, link prediction, visualization, etc.) for further analysis. Because the model is not built for a specific application, satisfactory results are hard to achieve. For the specific application of network community discovery, a deep auto-encoder clustering model that learns low-dimensional node representations together with community structure optimization, named Community-Aware Deep Network Embedding (CADNE), was proposed. Firstly, based on a deep auto-encoder, the low-dimensional node representations were learned by preserving the topological characteristics of the local and global links of the network. Then, these representations were further optimized by using the clustering structure of the network. The method learns the low-dimensional representations of the nodes and the indicator vectors of the communities the nodes belong to at the same time, so that the representations preserve both the topological characteristics of the original network and the clustering characteristics of the nodes. In comparison with existing classical network embedding methods, CADNE achieved the best clustering results on the Citeseer and Cora datasets and improved accuracy on 20NewsGroup by up to 0.525. In the classification task, CADNE achieved the best results on the Blogcatalog and Citeseer datasets, improving on the baseline methods by up to 0.512 on Blogcatalog with a 20% training ratio. In the visualization comparison, the CADNE model produced low-dimensional node representations with clearer class boundaries, which verifies that the proposed method has good low-dimensional node representation ability.

Key words: large-scale complex network, community structure, deep learning, node low-dimensional representation, network embedding
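The abstract describes a joint objective: preserve global and local link structure through an auto-encoder while optimizing community assignments at the same time. The NumPy sketch below shows one possible way to write such an objective. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the loss functions, so the weighted reconstruction term, the neighbor-distance term, the Student-t soft assignment with a KL-divergence clustering term, and all names (`joint_loss`, `soft_assignments`, `target_distribution`, `alpha`, `beta`, `gamma`, `centers`) are hypothetical choices. The single-layer encoder and decoder with random weights stand in for a trained deep auto-encoder.

```python
import numpy as np

def soft_assignments(Y, centers):
    """Student-t soft assignment of each embedding to each community center (assumed form)."""
    d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(Q):
    """Sharpened targets that emphasize confident assignments (assumed form)."""
    w = Q ** 2 / Q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def joint_loss(A, Y, A_hat, centers, alpha=1.0, gamma=1.0, beta=5.0):
    """Hypothetical joint objective: global + alpha * local + gamma * community."""
    # Global term: reconstruction error, with observed links weighted by beta.
    B = np.where(A > 0, beta, 1.0)
    loss_global = (((A_hat - A) * B) ** 2).sum()
    # Local term: linked nodes should have nearby embeddings.
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    loss_local = (A * d2).sum()
    # Community term: KL(P || Q) between sharpened targets and soft assignments.
    Q = soft_assignments(Y, centers)
    P = target_distribution(Q)
    loss_comm = (P * np.log(P / Q)).sum()
    return loss_global + alpha * loss_local + gamma * loss_comm, Q

# Toy graph: two triangles joined by one edge (illustrative data only).
rng = np.random.default_rng(0)
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

# Untrained single-layer encoder/decoder, used only to produce inputs for the loss.
W_enc = rng.normal(scale=0.1, size=(6, 2))
W_dec = rng.normal(scale=0.1, size=(2, 6))
Y = np.tanh(A @ W_enc)                       # low-dimensional node representations
A_hat = 1.0 / (1.0 + np.exp(-(Y @ W_dec)))   # reconstructed adjacency rows
centers = Y[[0, 5]]                          # initial community centers (assumption)

loss, Q = joint_loss(A, Y, A_hat, centers)
labels = Q.argmax(axis=1)                    # community indicator per node
print(loss, labels)
```

In a real training loop the encoder, decoder, and centers would be updated by gradient descent on this loss, so that the embeddings and the community indicators are learned together rather than in two separate stages.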

CLC Number:

TP391

LI Yafang, LIANG Ye, FENG Weiwei, ZU Baokai, KANG Yujian. Deep network embedding method based on community optimization[J]. Journal of Computer Applications, 2021, 41(7): 1956-1963.

