Auto-encoder based multi-view attributed network representation learning model

doi:10.11772/j.issn.1001-9081.2020061006

Abstract

Abstract: Most traditional network representation learning methods cannot account for both the rich structure information and the attribute information in a network at the same time, which degrades performance on downstream tasks such as classification and clustering. To address this problem, an Auto-Encoder based Multi-View Attributed Network Representation learning model (AE-MVANR) was proposed. First, the topological structure of the network was transformed into a Topological Structure View (TSV), and the co-occurrence frequencies of shared attributes between nodes were calculated to construct an Attributed Structure View (ASV). Then, a random walk algorithm was run on each of the two views separately to obtain a series of node sequences. Finally, all the generated sequences were fed into an auto-encoder model for training, yielding node representation vectors that integrate structure information and attribute information. Extensive classification and clustering experiments were carried out on several real-world datasets. The results show that AE-MVANR outperforms widely used network representation learning methods based solely on structure information, as well as those based on both network structure information and node attribute information. Specifically, the proposed model improves classification accuracy by up to 43.75%; for clustering, it increases Normalized Mutual Information (NMI) by up to 137.95% and the Silhouette Coefficient by up to 1,314.63%, and decreases the Davies-Bouldin Index (DBI) by up to 45.99%.
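The first two steps of the pipeline described in the abstract (building the two views, then sampling node sequences from each) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not define the co-occurrence frequency precisely, so the sketch assumes it is the raw count of attributes two nodes share, and it assumes the random walk picks the next node in proportion to edge weight. All function names and the toy data are hypothetical. The auto-encoder training step is left out because the abstract gives no architectural details.

```python
import random
from collections import defaultdict


def build_topology_view(edges):
    """Topological Structure View (TSV): undirected adjacency, unit weights."""
    view = defaultdict(dict)
    for u, v in edges:
        view[u][v] = 1.0
        view[v][u] = 1.0
    return view


def build_attribute_view(attributes):
    """Attributed Structure View (ASV).

    Assumption: the edge weight between two nodes is the number of
    attributes they share; nodes with no shared attribute are not linked.
    """
    view = defaultdict(dict)
    nodes = sorted(attributes)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            shared = len(attributes[u] & attributes[v])
            if shared > 0:
                view[u][v] = float(shared)
                view[v][u] = float(shared)
    return view


def random_walk(view, start, length, rng):
    """Weighted random walk (assumed: next node chosen in proportion to weight)."""
    walk = [start]
    while len(walk) < length:
        neighbors = view.get(walk[-1])
        if not neighbors:
            break  # dead end: stop the walk early
        candidates = list(neighbors)
        weights = [neighbors[n] for n in candidates]
        walk.append(rng.choices(candidates, weights=weights, k=1)[0])
    return walk


def generate_walks(view, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node that appears in the view."""
    rng = random.Random(seed)
    return [
        random_walk(view, node, length, rng)
        for _ in range(walks_per_node)
        for node in sorted(view)
    ]


# Toy network: 4 nodes, 3 edges, each node carries a set of attributes.
edges = [(0, 1), (1, 2), (2, 3)]
attributes = {0: {"a", "b"}, 1: {"b"}, 2: {"c"}, 3: {"a", "b", "c"}}

tsv = build_topology_view(edges)
asv = build_attribute_view(attributes)

# Sequences from both views are pooled; in the described model they would
# then be used to train the auto-encoder.
sequences = generate_walks(tsv, 2, 5) + generate_walks(asv, 2, 5)
```

In this toy example nodes 0 and 3 are not adjacent in the TSV but share two attributes, so they are linked in the ASV. Walks over the ASV can therefore place them in the same sequence, which is how attribute similarity would reach the learned representations.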

Key words: network representation learning, network embedding, node representation vector, multi-view attributed network, auto-encoder

CLC Number: TP183

FAN Wei, WANG Huimin, XING Yan. Auto-encoder based multi-view attributed network representation learning model[J]. Journal of Computer Applications, 2021, 41(4): 1064-1070.
