Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (4): 1064-1070. DOI: 10.11772/j.issn.1001-9081.2020061006

Special topic: Artificial Intelligence

Auto-encoder based multi-view attributed network representation learning model

FAN Wei, WANG Huimin, XING Yan

  1. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2020-07-10 Revised: 2020-10-13 Online: 2021-04-10 Published: 2020-12-30
  • About the authors: FAN Wei, born in 1968 in Qianxian, Shaanxi, Ph.D., professor, CCF member. His research interests include machine learning and intelligent information processing. WANG Huimin, born in 1996 in Datong, Shanxi, M.S. candidate. Her research interests include network representation learning and complex networks. XING Yan, born in 1987 in Cangzhou, Hebei, Ph.D., lecturer, CCF member. Her research interests include data mining, machine learning and complex networks.
  • Supported by:
    This work is partially supported by the Fundamental Research Funds for the Central Universities (3122018C020).

Abstract: Most existing network representation learning methods struggle to capture both the rich structure information and the attribute information in a network, which leads to poor performance on subsequent tasks such as classification and clustering. To solve this problem, an Auto-Encoder based Multi-View Attributed Network Representation learning model (AE-MVANR) was proposed. Firstly, the topological structure information of the network was transformed into the Topological Structure View (TSV), and the co-occurrence frequencies of the same attributes between nodes were calculated to construct the Attributed Structure View (ASV). Then, the random walk algorithm was applied to each of the two views to obtain a set of node sequences. Finally, all the generated sequences were fed into an auto-encoder for training, yielding node representation vectors that integrate structure information and attribute information. Extensive classification and clustering experiments were carried out on several real-world datasets. The results demonstrate that AE-MVANR outperforms widely used network representation learning methods based solely on network structure as well as those based on both network structure information and node attribute information. Specifically, the classification accuracy of the proposed model increased by up to 43.75%; for its clustering results, the Normalized Mutual Information (NMI) and the Silhouette Coefficient increased by up to 137.95% and 1314.63% respectively, and the Davies-Bouldin Index (DBI) decreased by up to 45.99%.
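
The first two steps described in the abstract (building the two views and sampling node sequences from them) can be illustrated with a minimal sketch. It is not the authors' implementation: the raw shared-attribute count used as the ASV edge weight, the weighted walk, the function names (`build_attribute_view`, `random_walk`, `generate_walks`) and the toy graph are all assumptions made for illustration, and the auto-encoder training step is left out.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_attribute_view(node_attrs):
    """Weight each node pair by the number of attributes the two nodes share.

    Assumption: a raw shared-attribute count stands in for the paper's
    "co-occurrence frequency"; the original may normalise it differently.
    """
    # Invert the mapping: attribute -> set of nodes carrying it.
    carriers = defaultdict(set)
    for node, attrs in node_attrs.items():
        for attr in attrs:
            carriers[attr].add(node)

    view = defaultdict(dict)
    for nodes in carriers.values():
        for u, v in combinations(sorted(nodes), 2):
            view[u][v] = view[u].get(v, 0) + 1
            view[v][u] = view[v].get(u, 0) + 1
    return dict(view)


def random_walk(view, start, length, rng):
    """Weighted random walk; stops early at a node without neighbours."""
    walk = [start]
    while len(walk) < length:
        neighbours = view.get(walk[-1], {})
        if not neighbours:
            break
        nodes = list(neighbours)
        weights = [neighbours[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def generate_walks(view, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node of the view."""
    rng = random.Random(seed)
    return [random_walk(view, node, length, rng)
            for _ in range(walks_per_node)
            for node in sorted(view)]


# Hypothetical toy network: four nodes, three edges, three attributes.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
node_attrs = {"a": {"x", "y"}, "b": {"x"}, "c": {"y", "z"}, "d": {"x", "y", "z"}}

# Topological Structure View: the unweighted adjacency structure itself.
tsv = defaultdict(dict)
for u, v in edges:
    tsv[u][v] = tsv[v][u] = 1
tsv = dict(tsv)

# Attributed Structure View: edges between nodes that share attributes.
asv = build_attribute_view(node_attrs)

# Sequences from both views would then be passed to the auto-encoder.
sequences = generate_walks(tsv, 2, 5) + generate_walks(asv, 2, 5)
```

In this toy example, nodes `a` and `d` are not adjacent in the TSV but share two attributes, so they are linked in the ASV; walks on the ASV can therefore place them in the same sequence even though walks on the topology alone would need several hops to do so.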

Key words: network representation learning, network embedding, node representation vector, multi-view attributed network, auto-encoder
