Link prediction algorithm based on network representation learning and random walk

doi:10.11772/j.issn.1001-9081.2017.08.2234

Journal of Computer Applications, 2017, Vol. 37, Issue (8): 2234-2239. DOI: 10.11772/j.issn.1001-9081.2017.08.2234

LIU Si¹, LIU Hai¹,², CHEN Qimai¹, HE Chaobo³

1. School of Computer, South China Normal University, Guangzhou, Guangdong 510631, China;
2. Guangdong Provincial Key Laboratory of High Performance Computing, Guangzhou, Guangdong 510033, China;
3. School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, Guangdong 510225, China

Received: 2016-12-29 Revised: 2017-02-09 Online: 2017-08-12 Published: 2017-08-10
Corresponding author: LIU Hai
About the authors: LIU Si (born 1992), male, from Fengcheng, Jiangxi, M.S. candidate, CCF member; research interests: data mining, big data processing. LIU Hai (born 1974), male, from Zhangjiajie, Hunan, associate professor, Ph.D., CCF member; research interests: text mining, deep learning. CHEN Qimai (born 1965), male, from Hengyang, Hunan, professor, M.S.; research interests: data mining, machine learning. HE Chaobo (born 1981), male, from Heyuan, Guangdong, associate professor, Ph.D., CCF senior member; research interests: data mining, social computing.
Supported by:
This work is partially supported by the Natural Science Foundation of Guangdong Province (2016A030313441), the Science and Technology Planning Project of Guangdong Province (2015B010129009, 2016A030303058, 2016A090922008, 2015A020209178), the Open Project Program of Guangdong Provincial Key Laboratory of High Performance Computing (T191527), and the Science and Technology Program of Guangzhou (201604016035).

Abstract

Existing link prediction indexes based on random walk make highly random transitions on unweighted networks, because they ignore how the structural similarity between a node and each of its neighbors should affect the transition probability. To address this problem, a link prediction algorithm based on network representation learning and random walk is proposed. First, DeepWalk, a network representation learning algorithm based on deep learning, learns the latent structural features of the network nodes and embeds each node into a low-dimensional vector space. Second, the similarity between neighboring nodes in that vector space is incorporated into the walking process of Random Walk with Restart (RWR) and Local Random Walk (LRW), and the transition probability between neighboring nodes is redefined accordingly. Finally, extensive experiments are carried out on five real datasets. The results show that, compared with eight representative link prediction baselines based on network structure, the proposed algorithms improve the AUC (Area Under the receiver operating characteristic Curve) in every case, by up to 3.34%.

Key words: link prediction, similarity, Random Walk with Restart (RWR), Local Random Walk (LRW), network representation learning
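
The second step of the abstract, biasing the walker toward neighbors that are similar in the embedding space, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the edge weight used here (cosine similarity rescaled to [0, 1]) is an assumption, the function names `similarity_biased_transition` and `rwr_scores` are hypothetical, and the embedding matrix `emb` is a random stand-in for vectors that DeepWalk would produce. The RWR score follows the standard definition q_x = (1 - c)(I - cPᵀ)⁻¹ e_x with s_xy = q_xy + q_yx.

```python
import numpy as np


def similarity_biased_transition(adj, emb, eps=1e-12):
    """Row-stochastic transition matrix biased by embedding similarity.

    Assumption (not taken from the paper): the weight of an edge (x, y) is
    the cosine similarity of the two embeddings rescaled to [0, 1]; each row
    is then normalised over the neighbours of x.
    """
    unit = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + eps)
    cos = unit @ unit.T
    # eps keeps every existing edge strictly positive; non-edges stay zero.
    weights = adj * ((1.0 + cos) / 2.0 + eps)
    row_sums = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, row_sums,
                     out=np.zeros_like(weights), where=row_sums > 0)


def rwr_scores(P, c=0.85):
    """Random Walk with Restart similarity for all node pairs.

    c is the probability of continuing the walk; 1 - c is the restart
    probability. Column x of Q is the stationary vector q_x of a walker
    that restarts at node x.
    """
    n = P.shape[0]
    Q = (1.0 - c) * np.linalg.inv(np.eye(n) - c * P.T)
    return Q + Q.T  # s_xy = q_xy + q_yx


# Toy undirected graph with edges 0-1, 0-2, 1-2, 2-3.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Hypothetical 2-dimensional embeddings standing in for DeepWalk output.
emb = np.random.default_rng(0).normal(size=(4, 2))

P = similarity_biased_transition(adj, emb)
S = rwr_scores(P)
print(np.round(P, 3))
print(np.round(S, 3))
```

Replacing `weights` with `adj` alone recovers the usual uniform transition probability 1/k_x, which is the baseline behavior the abstract describes as too random. The same matrix `P` could be substituted into a Local Random Walk, which iterates the walk for a fixed number of steps instead of solving for the stationary vector.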

CLC number: TP391; TP18

LIU Si, LIU Hai, CHEN Qimai, HE Chaobo. Link prediction algorithm based on network representation learning and random walk[J]. Journal of Computer Applications, 2017, 37(8): 2234-2239.

