Journal of Computer Applications (计算机应用)

• Artificial Intelligence and Simulation •

Semi-supervised representation learning method combining graph autoencoder and clustering

DU Hangyuan, HAO Sicong, WANG Wenjian

  1. Shanxi University
  • Received: 2021-07-28 Revised: 2021-10-18 Online: 2021-11-10 Published: 2021-11-10
  • Corresponding author: HAO Sicong

Abstract: Node labels are a widely available form of supervision information in complex networks and play an important role in network representation learning. To make use of this information, a Semi-supervised Representation Learning method combining Graph Auto-Encoder and Clustering (GAECSRL) was proposed. First, a graph convolutional network and an inner product function were used as the encoder and the decoder respectively to construct a graph auto-encoder that serves as the information propagation framework. Then, a k-means clustering module was added on top of the low-dimensional representation generated by the encoder, so that the training of the graph auto-encoder and the partition of nodes into categories formed a self-supervised mechanism. Finally, the discriminative information in the node labels was used to guide the category partition of the low-dimensional network representation. Network representation generation, category partition, and graph auto-encoder training were built into a unified optimization model, yielding an effective network representation that integrates node label information. In simulation experiments, GAECSRL was applied to node classification and link prediction tasks. The experimental results show that, compared with DeepWalk, node2vec, GraRep (Learning Graph Representations with Global Structural Information), SDNE (Structural Deep Network Embedding) and Planetoid (Predicting Labels and Neighbors with Embeddings Transductively Or Inductively from Data), GAECSRL improved the Micro-F1 index by 0.9 to 24.46 percentage points and the Macro-F1 index by 0.76 to 24.20 percentage points in the node classification task, and improved the AUC index by 0.33 to 9.06 percentage points in the link prediction task, indicating that the network representations obtained by GAECSRL can effectively improve the performance of node classification and link prediction.
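
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the two-layer encoder, the mean binary cross-entropy reconstruction loss, the squared-distance label term, the weights `alpha` and `beta`, the toy graph, and all function names are assumptions made for illustration. Only a forward pass and the resulting loss values are computed; no training is performed.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} commonly used by GCN layers."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_encode(adj_norm, x, w0, w1):
    """Assumed two-layer GCN encoder: Z = A_norm * ReLU(A_norm X W0) * W1."""
    hidden = np.maximum(adj_norm @ x @ w0, 0.0)
    return adj_norm @ hidden @ w1

def inner_product_decode(z):
    """Inner-product decoder: edge probability sigmoid(z_i . z_j)."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

def reconstruction_loss(adj, adj_rec, eps=1e-9):
    """Mean binary cross-entropy between the adjacency matrix and its reconstruction."""
    return -np.mean(adj * np.log(adj_rec + eps)
                    + (1.0 - adj) * np.log(1.0 - adj_rec + eps))

def kmeans_assign(z, centers):
    """Assign each embedding to its nearest center (squared Euclidean distance)."""
    dist = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1), dist

def clustering_loss(z, centers):
    """k-means objective: mean squared distance of each node to its nearest center."""
    assign, dist = kmeans_assign(z, centers)
    return dist[np.arange(len(z)), assign].mean()

def label_loss(z, centers, labels, labeled_idx):
    """Assumed label term: pull each labeled node towards the center of its class."""
    diff = z[labeled_idx] - centers[labels[labeled_idx]]
    return (diff ** 2).sum(axis=1).mean()

# Toy graph: two triangles joined by one edge (hypothetical data).
adj = np.array([[0, 1, 1, 0, 0, 0],
                [1, 0, 1, 0, 0, 0],
                [1, 1, 0, 1, 0, 0],
                [0, 0, 1, 0, 1, 1],
                [0, 0, 0, 1, 0, 1],
                [0, 0, 0, 1, 1, 0]], dtype=float)
x = np.eye(6)                        # identity features for a featureless graph
labels = np.array([0, 0, 0, 1, 1, 1])
labeled_idx = np.array([0, 1, 4, 5])  # only these labels are treated as known

rng = np.random.default_rng(0)
w0 = rng.normal(scale=0.5, size=(6, 4))   # untrained encoder weights
w1 = rng.normal(scale=0.5, size=(4, 2))

adj_norm = normalize_adjacency(adj)
z = gcn_encode(adj_norm, x, w0, w1)       # low-dimensional node representation
adj_rec = inner_product_decode(z)

# One center per class, initialized from the mean of that class's labeled nodes.
centers = np.stack([z[labeled_idx][labels[labeled_idx] == c].mean(axis=0)
                    for c in range(2)])

alpha, beta = 0.1, 0.1                    # assumed trade-off weights
total = (reconstruction_loss(adj, adj_rec)
         + alpha * clustering_loss(z, centers)
         + beta * label_loss(z, centers, labels, labeled_idx))
print("embedding shape:", z.shape, "total loss:", float(total))
```

In a trainable version, the three terms would be minimized jointly with respect to the encoder weights and the cluster centers using an automatic differentiation framework; summing them into one objective mirrors the unified optimization model described in the abstract.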
