Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (9): 2643-2651. DOI: 10.11772/j.issn.1001-9081.2021071354

Special Issue: Artificial intelligence

• Artificial intelligence •

Semi-supervised representation learning method combining graph auto-encoder and clustering

Hangyuan DU1, Sicong HAO1, Wenjian WANG1,2

  1. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan Shanxi 030006, China
  • Received: 2021-07-28 Revised: 2021-10-18 Accepted: 2021-10-21 Online: 2021-11-10 Published: 2022-09-10
  • Contact: Wenjian WANG
  • About author: DU Hangyuan, born in 1985, Ph. D., associate professor, CCF member. His research interests include cluster analysis and complex networks.
    HAO Sicong, born in 1995, M. S. candidate. Her research interests include machine learning and network data mining.
  • Supported by:
    National Natural Science Foundation of China (61902227); Scientific and Technological Innovation Program of Higher Education Institutions in Shanxi Province (2019L0039); Natural Science Foundation of Shanxi Province (201901D211192)




Node labels are a widely available form of supervision information in complex networks and play an important role in network representation learning. On this basis, a Semi-supervised Representation Learning method combining Graph Auto-Encoder and Clustering (GAECSRL) was proposed. Firstly, a Graph Convolutional Network (GCN) and an inner product function were used as the encoder and the decoder respectively to construct a graph auto-encoder that serves as an information propagation framework. Then, a k-means clustering module was applied to the low-dimensional representation generated by the encoder, so that the training of the graph auto-encoder and the category assignment of the nodes formed a self-supervised mechanism. Finally, the discriminant information of the node labels was used to guide the category assignment of the low-dimensional network representation. Network representation generation, category assignment, and graph auto-encoder training were integrated into a unified optimization model, yielding an effective network representation that incorporates node label information. In the simulation experiments, GAECSRL was applied to node classification and link prediction tasks. Experimental results show that, compared with DeepWalk, node2vec, learning Graph Representations with global structural information (GraRep), Structural Deep Network Embedding (SDNE) and Planetoid (Predicting labels and neighbors with embeddings transductively or inductively from data), GAECSRL improves the Micro-F1 index by 0.9 to 24.46 percentage points and the Macro-F1 index by 0.76 to 24.20 percentage points in the node classification task; in the link prediction task, GAECSRL improves the Area Under Curve (AUC) index by 0.33 to 9.06 percentage points. These results indicate that the network representations obtained by GAECSRL effectively improve the performance of node classification and link prediction tasks.
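
The pipeline summarized in the abstract (GCN encoder, inner-product decoder, k-means term on the embeddings, label guidance) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The abstract does not give the loss formulas, so several details here are assumptions: the two-layer encoder, the binary cross-entropy reconstruction loss, the label term that ties class k to centroid k, and the hypothetical weights `beta` and `gamma`. The sketch only evaluates the combined objective once on a toy graph; gradient-based training is omitted.

```python
import math
import random

def matmul(a, b):
    """Dense matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_encode(a_norm, x, w0, w1):
    """Two-layer GCN encoder (assumed depth): Z = A_norm ReLU(A_norm X W0) W1."""
    h = [[max(0.0, v) for v in row] for row in matmul(matmul(a_norm, x), w0)]
    return matmul(matmul(a_norm, h), w1)

def reconstruction_loss(z, adj):
    """Inner-product decoder sigmoid(z_i . z_j), scored with binary cross-entropy.
    Uses the stable identity BCE(logit, a) = softplus(logit) - a * logit."""
    n, total = len(z), 0.0
    for i in range(n):
        for j in range(n):
            logit = sum(p * q for p, q in zip(z[i], z[j]))
            softplus = max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))
            total += softplus - adj[i][j] * logit
    return total / (n * n)

def sq_dist(u, v):
    return sum((p - q) ** 2 for p, q in zip(u, v))

def clustering_loss(z, centroids):
    """k-means objective: mean squared distance to the nearest centroid."""
    return sum(min(sq_dist(zi, c) for c in centroids) for zi in z) / len(z)

def label_loss(z, centroids, labels):
    """Assumed label guidance: pull each labeled node toward the centroid of its class."""
    return sum(sq_dist(z[i], centroids[k]) for i, k in labels.items()) / len(labels)

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
adj = [[0.0] * n for _ in range(n)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1.0

x = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # one-hot features
rng = random.Random(0)
w0 = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(n)]  # untrained weights
w1 = [[rng.gauss(0.0, 0.5) for _ in range(2)] for _ in range(4)]

a_norm = normalize_adjacency(adj)
z = gcn_encode(a_norm, x, w0, w1)   # low-dimensional node representations
centroids = [z[1], z[4]]            # naive centroid initialization (assumption)
labels = {0: 0, 5: 1}               # the only two labeled nodes: node -> class

beta, gamma = 1.0, 1.0              # hypothetical trade-off weights
l_rec = reconstruction_loss(z, adj)
l_clu = clustering_loss(z, centroids)
l_lab = label_loss(z, centroids, labels)
total = l_rec + beta * l_clu + gamma * l_lab
print(f"reconstruction={l_rec:.4f} clustering={l_clu:.4f} label={l_lab:.4f} total={total:.4f}")
```

In a full implementation the three terms would be minimized jointly with respect to the encoder weights and the centroids, typically with an automatic-differentiation framework, which is what makes the clustering and label terms shape the learned representation.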

Key words: network representation learning, network embedding, node label, graph neural network, self-supervised mechanism



