Network representation learning algorithm based on improved random walk

Journal of Computer Applications, 2019, Vol. 39, Issue (3): 651-655. DOI: 10.11772/j.issn.1001-9081.2018071509

WANG Wentao, HUANG Ye, WU Lintao, KE Xuan, TANG Wan

College of Computer Science, South-Central University for Nationalities, Wuhan Hubei 430074, China

Received: 2018-07-23; Revised: 2018-09-03; Online: 2019-03-11; Published: 2019-03-10
Contact: HUANG Ye
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61103248), the Fundamental Research Funds for the Central Universities of South-Central University for Nationalities (CZY18014), and the Innovative Research Program for Graduates of South-Central University for Nationalities (2018sycxjj269).

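About the authors: WANG Wentao (born 1967), male, from Handan, Hebei; associate professor, Ph.D.; main research interests: computer networks and control. HUANG Ye (born 1993), male, from Dawu, Hubei; master's student; main research interests: knowledge representation, neural networks. WU Lintao (born 1994), male, from Chaling, Hunan; master's student; main research interest: data mining. KE Xuan (born 1994), female, from Yangxin, Hubei; master's student; main research interest: artificial intelligence. TANG Wan (born 1974), female, from Duyun, Guizhou; professor, Ph.D.; main research interests: optical/wireless network protocols, network security.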

Abstract

Abstract: Existing Word2vec-based Network Representation Learning (NRL) algorithms use a Random Walk (RW) to generate node sequences. The RW tends to select nodes with larger degrees, so the resulting node sequences do not reflect the network structure well, which degrades the performance of the algorithm. To address this problem, a network representation learning algorithm based on an improved random walk is proposed. First, RLP-MHRW (Remove self-Loop Probability for Metropolis-Hastings Random Walk) is used to generate node sequences. Because this algorithm does not favor nodes with larger degrees when forming a node sequence, the obtained sequences reflect the network structure information better. Then, the node sequences are fed into the Skip-gram model to obtain node representation vectors. Finally, the performance of the network representation learning algorithm is measured on a link prediction task. Comparative experiments were performed on four real network datasets. On the paper collaboration network arXiv ASTRO-PH, the AUC (Area Under Curve) value of link prediction increased by 8.9% and 3.5% compared with LINE (Large-scale Information Network Embedding) and node2vec respectively, and improvements were also observed on the other datasets. Experimental results show that RLP-MHRW can effectively improve the performance of Word2vec-based network representation learning algorithms.

Key words: Network Representation Learning (NRL), Random Walk (RW), link prediction, unbiased sampling, Machine Learning (ML)
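
To make the sequence-generation step in the abstract more concrete, the following is a minimal sketch of a degree-corrected walk. It is not the authors' implementation: the abstract does not state the exact RLP-MHRW transition rule, so the sketch assumes one plausible reading, namely that the standard Metropolis-Hastings acceptance ratio min(1, deg(u)/deg(v)) is kept and the probability of staying at the current node is removed by renormalizing over the neighbors. The function names, the adjacency-dictionary input format, and the parameters `walks_per_node`, `length`, and `seed` are all hypothetical.

```python
import random


def mh_walk_no_self_loop(adj, start, length, rng=None):
    """Hypothetical sketch of a Metropolis-Hastings style walk without self-loops.

    adj maps each node to a list of its neighbors (undirected graph assumed,
    so every neighbor has degree >= 1). This is an assumed reading of
    RLP-MHRW, not the rule published in the paper.
    """
    rng = rng or random.Random()
    walk = [start]
    current = start
    while len(walk) < length:
        neighbors = adj[current]
        if not neighbors:
            break  # isolated node: the walk cannot continue
        deg_u = len(neighbors)
        # MH acceptance ratio for a uniform neighbor proposal.
        # High-degree neighbors receive a smaller weight.
        weights = [min(1.0, deg_u / len(adj[v])) for v in neighbors]
        # Sampling proportionally to the acceptance ratios is equivalent to
        # re-proposing until a move is accepted, so the walk never stays put.
        current = rng.choices(neighbors, weights=weights, k=1)[0]
        walk.append(current)
    return walk


def generate_walks(adj, walks_per_node, length, seed=0):
    """Start walks_per_node walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(mh_walk_no_self_loop(adj, node, length, rng))
    return walks
```

Under these assumptions, the returned walks play the role of sentences and could be passed to any Skip-gram implementation, for example gensim's `Word2Vec` with `sg=1` after converting node identifiers to strings. Link-prediction AUC would then be computed from similarity scores between the learned node vectors.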

CLC Number: TP391; TP18

WANG Wentao, HUANG Ye, WU Lintao, KE Xuan, TANG Wan. Network representation learning algorithm based on improved random walk[J]. Journal of Computer Applications, 2019, 39(3): 651-655.

