Subgraph-aware contrastive learning with data augmentation

李玟1,李开荣2,杨凯1   

  1. Yangzhou University, 196 Huayang West Road, Hanjiang District, Yangzhou, Jiangsu, China
  2. Yangzhou University
  • Received: 2025-02-07  Revised: 2025-04-01  Online: 2025-04-27  Published: 2025-04-27
  • Corresponding author: 李开荣
  • Supported by:
    National Natural Science Foundation of China

Abstract: Graph Neural Networks (GNNs) are effective graph representation methods for processing graph-structured data. In practical applications, however, GNN performance is limited by missing information. On the one hand, graph structures are usually sparse, which makes it difficult for the model to learn node features adequately; on the other hand, the label data on which supervised learning relies are often scarce, which constrains model training and makes it hard to obtain robust node representations. To address these problems, a Subgraph-aware Contrastive Learning with Data Augmentation (SCLDA) model is proposed. First, relationship scores between node pairs are obtained by learning the original graph through link prediction, and the highest-scoring edges are added to the original graph to generate an augmented graph. Second, local subgraphs centered on target nodes are sampled from the original graph and the augmented graph, respectively, and the sampled subgraphs are fed into a shared GNN encoder to generate subgraph-level embeddings of the target nodes. Finally, contrastive learning over the target-node embeddings from the two subgraph views maximizes the mutual information between similar instances. In node-classification experiments on six public datasets (Cora, Citeseer, Pubmed, Cora_ML, DBLP, and Photo), SCLDA improves accuracy over the traditional GCN model by about 4.4%, 6.3%, 4.5%, 7.0%, 13.2%, and 9.3%, respectively.
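
Note: the abstract above describes a three-step pipeline (link-prediction-based edge augmentation, subgraph sampling around target nodes, and subgraph-level contrastive learning). The sketch below is only a minimal illustration of that data flow in plain PyTorch and is not the authors' implementation; the one-layer GCN encoder, the inner-product link scorer, the 1-hop subgraph sampler, the InfoNCE-style loss, and all hyperparameters (number of added edges k, temperature tau) are assumptions introduced here for illustration.

```python
# Minimal, illustrative sketch of the SCLDA pipeline described in the abstract.
# NOT the authors' implementation: encoder, link scorer, subgraph sampler and
# every hyperparameter below are assumptions made for illustration only.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# --- Toy graph: N nodes, random features, sparse symmetric adjacency ----------
N, F_IN, F_OUT = 8, 5, 4
x = torch.randn(N, F_IN)
adj = (torch.rand(N, N) < 0.2).float()
adj = ((adj + adj.t()) > 0).float()
adj.fill_diagonal_(0)

class GCNEncoder(torch.nn.Module):
    """One-layer GCN-style encoder (hypothetical stand-in for the shared encoder)."""
    def __init__(self, f_in, f_out):
        super().__init__()
        self.lin = torch.nn.Linear(f_in, f_out)

    def forward(self, x, adj):
        a_hat = adj + torch.eye(adj.size(0))        # add self-loops
        d_inv_sqrt = torch.diag(a_hat.sum(dim=1).pow(-0.5))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt    # symmetric normalization
        return F.relu(self.lin(a_norm @ x))

encoder = GCNEncoder(F_IN, F_OUT)                   # shared across both views

# --- Step 1: link-prediction-style augmentation --------------------------------
# Score every non-edge with an inner product of encoder embeddings, then add the
# k highest-scoring edges to obtain the augmented graph.
with torch.no_grad():
    z = encoder(x, adj)
    scores = z @ z.t()
    scores[adj.bool()] = float("-inf")              # ignore existing edges
    scores.fill_diagonal_(float("-inf"))
    k = 4
    top = torch.topk(scores.flatten(), k).indices
    aug_adj = adj.clone()
    aug_adj[top // N, top % N] = 1.0
    aug_adj = ((aug_adj + aug_adj.t()) > 0).float() # keep it symmetric

# --- Step 2: sample a local (1-hop) subgraph around each target node -----------
def local_subgraph(adj_view, target):
    """Node indices of the target's 1-hop neighbourhood, including itself."""
    mask = adj_view[target].bool().clone()
    mask[target] = True
    return mask.nonzero(as_tuple=True)[0]

def subgraph_embedding(adj_view, target):
    """Encode the target's local subgraph and return the target-node row."""
    nodes = local_subgraph(adj_view, target)
    sub_adj = adj_view[nodes][:, nodes]
    z_sub = encoder(x[nodes], sub_adj)
    return z_sub[(nodes == target).nonzero(as_tuple=True)[0][0]]

# --- Step 3: contrastive objective between the two views -----------------------
def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style loss: matching target nodes across views are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

z_orig = torch.stack([subgraph_embedding(adj, t) for t in range(N)])
z_aug = torch.stack([subgraph_embedding(aug_adj, t) for t in range(N)])
loss = info_nce(z_orig, z_aug)
print(f"contrastive loss: {loss.item():.4f}")
```

The sketch only fixes the data flow (original graph, scored candidate edges, augmented graph, two subgraph views per target node, shared encoder, contrastive loss); the actual link predictor, sampling radius, and objective used by SCLDA may differ.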

Key words: graph representation learning, graph neural network, data augmentation, self-supervised learning, graph contrastive learning, node classification

CLC number: