Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (8): 2234-2239.DOI: 10.11772/j.issn.1001-9081.2017.08.2234

Previous Articles     Next Articles

Link prediction algorithm based on network representation learning and random walk

LIU Si1, LIU Hai1,2, CHEN Qimai1, HE Chaobo3   

  1. 1. School of Computer, South China Normal University, Guangzhou Guangdong 510631, China;
    2. Guangdong Provincial Key Laboratory of High Performance Computing, Guangzhou Guangdong 510033, China;
    3. School of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou Guangdong 510225, China
  • Received:2016-12-29 Revised:2017-02-09 Online:2017-08-10 Published:2017-08-12
  • Supported by:
    This work is partially supported by the Natural Science Foundation of Guangdong Province (2016A030313441),the Science and Technology Planning Project of Guangdong Province (2015B010129009,2016A030303058,2016A090922008,2015A020209178),the Open Project Program of Guangdong Provincial Key Laboratory of High Performance Computing (T191527),the Science and Technology Program of Guangzhou (201604016035).

基于网络表示学习与随机游走的链路预测算法

刘思1, 刘海1,2, 陈启买1, 贺超波3   

  1. 1. 华南师范大学 计算机学院, 广州 510631;
    2. 广东省高性能计算重点实验室, 广州 510033;
    3. 仲恺农业工程学院 信息科学与技术学院, 广州 510225
  • 通讯作者: 刘海
  • 作者简介:刘思(1992-),男,江西丰城人,硕士研究生,CCF会员,主要研究方向:数据挖掘、大数据处理;刘海(1974-),男,湖南张家界人,副教授,博士,CCF会员,主要研究方向:文本挖掘、深度学习;陈启买(1965-),男,湖南衡阳人,教授,硕士,主要研究方向:数据挖掘、机器学习;贺超波(1981-),男,广东河源人,副教授,博士,CCF高级会员,主要研究方向:数据挖掘、社会计算。
  • 基金资助:
    广东省自然科学基金自由申请项目(2016A030313441);广东省科技计划项目(2015B010129009,2016A030303058,2016A090922008,2015A020209178);广东省高性能计算重点实验室开放课题项目(T191527);广州市科技计划项目(201604016035)。

Abstract: The transition process of existing link prediction indexes based on random walk exists strong randomness in the unweighted network and does not consider the effect of the similarity of the network structure between different neighbor nodes on transition probability. In order to solve the problems, a new link prediction algorithm based on network representation learning and random walk was proposed. Firstly, the latent structure features of network node were learnt by DeepWalk which is a network representation learning algorithm based on deep learning, and the network nodes were encoded into low-dimensional vector space. Secondly, the similarity between neighbor nodes in vector space was incorporated into the transition process of Random Walk with Restart (RWR) and Local Random Walk (LRW) respectively and the transition probability of each random walk was redefined. Finally, a large number of experiments on five real datasets were carried out. The experimental results show that the AUC (Area Under the receiver operating characteristic Curve) values of the proposed algorithms are improved up to 3.34% compared with eight representative link prediction benchmarks based on network structure.

Key words: link prediction, similarity, Random Walk with Restart (RWR), Local Random Walk (LRW), network representation learning

摘要: 现有的基于随机游走链路预测指标在无权网络上的转移过程存在较强随机性,没有考虑在网络结构上不同邻居节点间的相似性对转移概率的作用。针对此问题,提出一种基于网络表示学习与随机游走的链路预测算法。首先,通过基于深度学习的网络表示学习算法——DeepWalk学习网络节点的潜在结构特征,将网络中的各节点表征到低维向量空间;然后,在重启随机游走(RWR)和局部随机游走(LRW)算法的随机游走过程中融合各邻居节点在向量空间上的相似性,重新定义出邻居节点间的转移概率;最后,在5个真实数据集上进行大量实验验证。实验结果表明:相比8种具有代表性的基于网络结构的链路预测基准算法,所提算法链路预测结果的AUC值均有提升,最高达3.34%。

关键词: 链路预测, 相似性, 重启随机游走, 局部随机游走, 网络表示学习

CLC Number: