Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (4): 1012-1020. DOI: 10.11772/j.issn.1001-9081.2018081851

Network representation learning algorithm incorporated with node profile attribute information

LIU Zhengming, MA Hong, LIU Shuxin, LI Haitao, CHANG Sheng   

  1. National Digital Switching System Engineering & Technological Research Center, Zhengzhou Henan 450002, China
  • Received: 2018-09-06 Revised: 2018-11-16 Online: 2019-04-10 Published: 2019-04-10
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61521003, 61803384).

融合节点描述属性信息的网络表示学习算法

刘正铭, 马宏, 刘树新, 李海涛, 常圣   

  1. 国家数字交换系统工程技术研究中心, 郑州 450002
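  • Corresponding author: LIU Zhengming
  • About the authors: LIU Zhengming (born 1995), male, from Nanchong, Sichuan; M.S. candidate, CCF member; research interests: network big data analysis, network representation learning. MA Hong (born 1968), male, from Dongtai, Jiangsu; research fellow, M.S.; research interests: social network analysis, telecommunication network security and protection. LIU Shuxin (born 1987), male, from Linqu, Shandong; assistant research fellow, Ph.D.; research interests: complex networks, network data mining. LI Haitao (born 1982), male, from Tai'an, Shandong; lecturer, M.S.; research interests: network data mining. CHANG Sheng (born 1988), male, from Zhengzhou, Henan; M.S. candidate; research interests: network data mining.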
  • 基金资助:
    国家自然科学基金资助项目(61521003,61803384)。

Abstract: To improve the quality of network representation learning by using node profile information, and to address the dispersed semantics and incompleteness of node profile attribute information in social networks, a network representation learning algorithm incorporating node profile attributes, named NPA-NRL, was proposed. Firstly, attribute information was encoded with one-hot encoding, and a data augmentation method based on random perturbation was introduced to compensate for the incompleteness of the attribute information. Then, the attribute encoding and the structure encoding were concatenated as the input of a deep neural network, so that the two types of information complement and constrain each other. Finally, an attribute similarity measure function based on network homophily and a structural similarity measure function based on the SkipGram model were designed, and the fused semantic information was mined through joint training. Experimental results on three real-world network datasets, GPLUS, OKLAHOMA and UNC, show that, compared with the classic DeepWalk, Text-Associated DeepWalk (TADW), User Profile Preserving Social Network Embedding (UPP-SNE) and Social Network Embedding (SNE) algorithms, the proposed NPA-NRL algorithm improves the average Area Under the ROC Curve (AUC) on the link prediction task by 2.75% and the average F1 score on the node classification task by 7.10%.

Key words: node profile attribute information, information fusion, network representation learning, deep learning, complex network
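
The input construction described in the abstract (one-hot attribute encoding, random perturbation, concatenation with a structure encoding) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the perturbation scheme or the form of the structure encoding, so the sketch assumes that perturbation randomly zeroes active attribute entries and that the structure encoding is a one-hot node index. All function names, the `vocab` layout and the `drop_prob` parameter are hypothetical.

```python
import random


def one_hot_profile(profile, vocab):
    """Encode a node profile (field -> value) as concatenated one-hot blocks.

    vocab maps each field name to the ordered list of its possible values.
    A missing or unknown value leaves that field's block all zeros.
    """
    vec = []
    for field, values in vocab.items():
        block = [0.0] * len(values)
        value = profile.get(field)
        if value in values:
            block[values.index(value)] = 1.0
        vec.extend(block)
    return vec


def perturb(vec, drop_prob, rng):
    """Assumed augmentation: zero each active entry with probability drop_prob."""
    return [0.0 if x == 1.0 and rng.random() < drop_prob else x for x in vec]


def structure_one_hot(node_index, num_nodes):
    """Assumed structure encoding: a one-hot vector over node indices."""
    vec = [0.0] * num_nodes
    vec[node_index] = 1.0
    return vec


def network_input(node_index, num_nodes, profile, vocab, drop_prob=0.0, rng=None):
    """Concatenate the structure encoding and the perturbed attribute encoding."""
    rng = rng or random.Random(0)
    attr = perturb(one_hot_profile(profile, vocab), drop_prob, rng)
    return structure_one_hot(node_index, num_nodes) + attr


# Toy example with made-up profile fields.
vocab = {"gender": ["f", "m"], "major": ["cs", "ee", "math"]}
x = network_input(1, 3, {"gender": "m", "major": "cs"}, vocab)
x_masked = network_input(1, 3, {"gender": "m", "major": "cs"}, vocab, drop_prob=1.0)
x_partial = network_input(0, 3, {"gender": "f"}, vocab)
```

With `drop_prob=0.0` the attribute block is passed through unchanged. With `drop_prob=1.0` every active attribute entry is zeroed, which mimics a node whose profile is entirely missing. A profile that omits a field, as in `x_partial`, simply leaves that field's block at zero.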

摘要: 为融合节点描述信息提升网络表示学习质量,针对社会网络中节点描述属性信息存在的语义信息分散和不完备性问题,提出一种融合节点描述属性的网络表示(NPA-NRL)学习算法。首先,对属性信息进行独热编码,并引入随机扰动的数据集增强策略解决属性信息不完备问题;然后,将属性编码和结构编码拼接作为深度神经网络输入,实现两方面信息的相互补充制约;最后,设计了基于网络同质性的属性相似性度量函数和基于SkipGram模型的结构相似性度量函数,通过联合训练实现融合语义信息挖掘。在GPLUS、OKLAHOMA和UNC三个真实网络数据集上的实验结果表明,和经典的DeepWalk、TADW(Text-Associated DeepWalk)、UPP-SNE(User Profile Preserving Social Network Embedding)和SNE(Social Network Embedding)算法相比,NPA-NRL算法的链路预测AUC(Area Under Curve of ROC)值平均提升2.75%,节点分类F1值平均提升7.10%。

关键词: 节点描述属性信息, 信息融合, 网络表示学习, 深度学习, 复杂网络
