Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (11): 3327-3334. DOI: 10.11772/j.issn.1001-9081.2023101526
Received: 2023-11-18
Revised: 2024-02-01
Accepted: 2024-02-05
Online: 2024-11-13
Published: 2024-11-10
Contact: Biao CAI
About author: HU Nengbing, born in 1998 in Chengdu, Sichuan, M. S. candidate, CCF member. His research interests include graph neural networks and graph contrast learning.
        
Nengbing HU, Biao CAI, Xu LI, Danhua CAO
Abstract:
In graph classification tasks, the graph embeddings produced by existing node-dropping graph pooling algorithms fail to make effective use of the information contained in the dropped nodes and of inter-graph node information; moreover, traditional methods do not learn the graph embedding separately, which limits part of their performance on graph classification. To overcome these shortcomings, a graph embedding method that effectively exploits dropped-node information, a Graph classification method based on graph Pooling Contrast Learning (GPCL), was proposed. First, a graph attention mechanism was used to learn an attention score for each node; the nodes were ranked by their attention scores, and nodes with low scores were dropped. Then, the nodes retained in a graph were taken as positive samples, some of the nodes dropped from other graphs were taken as negative samples, and the graph's embedding was taken as the target node; pairwise similarity scores were computed among them for contrastive learning. Experimental results show that, on the D&D (Dobson PD-Doig AJ), MUTAG, PROTEINS and IMDB-B datasets, compared with the method using only the attention mechanism and hierarchical pooling, GPCL improves the graph classification accuracy by 5.79, 15.54, 5.42 and 1.75 percentage points respectively, verifying that GPCL substantially improves the utilization of inter-graph information and performs well on graph classification tasks.
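The procedure described in the abstract (attention-based ranking, dropping of low-score nodes, and pairwise similarity between a graph's embedding, its retained nodes, and dropped nodes from other graphs) can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' implementation: the function names, the pooling ratio, and the InfoNCE-style form of the contrastive loss are assumptions.

```python
import math

def top_k_pool(scores, ratio=0.5):
    """Rank nodes by attention score; keep the top `ratio` fraction.

    Returns (kept_indices, dropped_indices), both ordered by
    descending score, mimicking a node-dropping pooling layer.
    """
    k = max(1, int(len(scores) * ratio))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], order[k:]

def dot(u, v):
    """Plain dot product used as the similarity score."""
    return sum(a * b for a, b in zip(u, v))

def contrastive_loss(target, positives, negatives, tau=0.5):
    """InfoNCE-style loss: pull the retained-node embeddings
    (`positives`, from the same graph) toward the graph embedding
    `target`, and push dropped-node embeddings from other graphs
    (`negatives`) away.
    """
    pos = [math.exp(dot(target, p) / tau) for p in positives]
    neg_sum = sum(math.exp(dot(target, n) / tau) for n in negatives)
    return -sum(math.log(p / (p + neg_sum)) for p in pos) / len(pos)
```

For example, `top_k_pool([0.9, 0.1, 0.5, 0.7], ratio=0.5)` keeps the two highest-scoring nodes (indices 0 and 3) and drops the rest, after which the dropped nodes of *other* graphs would feed `contrastive_loss` as negatives.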
Nengbing HU, Biao CAI, Xu LI, Danhua CAO. Graph classification method based on graph pooling contrast learning[J]. Journal of Computer Applications, 2024, 44(11): 3327-3334.
| Dataset | Graphs | Classes | Avg. nodes | Avg. edges |
|---|---|---|---|---|
| D&D | 1 178 | 2 | 284.32 | 1 431.30 |
| MUTAG | 188 | 2 | 17.93 | 19.79 |
| PROTEINS | 1 113 | 2 | 39.06 | 72.82 |

Tab. 1 Biochemical dataset statistics
| Dataset | Graphs | Classes | Avg. nodes | Avg. edges |
|---|---|---|---|---|
| IMDB-B | 1 000 | 2 | 19.77 | 96.53 |
| IMDB-M | 1 500 | 3 | 13.00 | 65.94 |

Tab. 2 Social network dataset statistics
| Hyperparameter | Value range |
|---|---|
| Learning rate | 0.0001, 0.0005, 0.001 |
| Hidden dimension | 64, 128 |
| Pooling ratio | 0.4, 0.5, 0.6 |

Tab. 3 Hyperparameter value ranges recommended in literature [10]
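As a sketch of how these recommended ranges might be swept, the following enumerates every combination as a plain grid search. The dictionary keys and helper name are hypothetical, not taken from the paper.

```python
from itertools import product

# Hypothetical grid over the ranges of Tab. 3:
# learning rate x hidden dimension x pooling ratio.
GRID = {
    "lr": [0.0001, 0.0005, 0.001],
    "hidden_dim": [64, 128],
    "pool_ratio": [0.4, 0.5, 0.6],
}

def grid_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# 3 learning rates x 2 hidden dims x 3 pooling ratios = 18 candidates.
configs = list(grid_configs(GRID))
```

Each of the 18 candidate configurations would then be trained and evaluated, keeping the one with the best validation accuracy.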
| Method | D&D | MUTAG | PROTEINS | IMDB-B | IMDB-M |
|---|---|---|---|---|---|
| SAGPooL_g | 71.54±0.91 | 76.78±2.12 | 72.02±1.08 | 72.16±0.88 | 49.47±0.56 |
| SAGPooL_h | 74.72±0.82 | 73.67±4.28 | 71.56±1.49 | 72.55±1.28 | 50.23±0.44 |
| ASAP | 76.58±1.04 | 77.83±1.49 | 77.83±1.49 | 72.81±0.50 | 50.78±0.75 |
| MinCutPool | 78.22±0.54 | 79.17±1.64 | 74.72±0.48 | 72.65±0.75 | 51.04±0.70 |
| Sequ2Sequ | 71.94±0.56 | 69.89±1.94 | 73.27±0.85 | 72.90±0.75 | 50.19±0.39 |
| StructPool | 78.45±0.40 | 79.50±1.75 | 75.16±0.86 | 72.06±0.64 | 50.23±0.53 |
| SortPool | 75.58±0.72 | 71.94±3.55 | 73.17±0.88 | 72.12±1.12 | 48.18±0.83 |
| GMT | 78.72±0.59 | 83.44±1.33 | 75.09±0.59 | 73.48±0.76 | 50.66±0.82 |
| SEP-G | 77.98±0.57 | 85.56±1.09 | 76.42±0.39 | 74.12±0.56 | 51.53±0.65 |
| GPCL | 80.51±0.86 | 89.21±1.47 | 76.98±0.72 | 74.30±0.68 | 49.57±0.61 |

Tab. 4 Comparison of accuracy ± standard deviation between GPCL and other methods (%; D&D, MUTAG and PROTEINS are biochemical datasets, IMDB-B and IMDB-M are social network datasets)
| Method | D&D | MUTAG | PROTEINS | IMDB-B | IMDB-M |
|---|---|---|---|---|---|
| GCN | 72.05±0.55 | 69.50±1.78 | 73.24±0.73 | 73.26±0.46 | 50.39±0.41 |
| GCN+POOL | 71.54±0.91 | 76.78±2.12 | 72.02±1.08 | 72.16±0.88 | 49.47±0.56 |
| GPCL | 80.51±0.86 | 89.21±1.47 | 76.98±0.72 | 74.30±0.68 | 49.57±0.61 |

Tab. 5 Accuracy ± standard deviation results of ablation experiments (unit: %)
| Model | D&D | MUTAG | IMDB-B |
|---|---|---|---|
| GPCL | 80.51 | 89.21 | 74.30 |
| GPCL_GCN-1layer | 78.98 | 88.89 | 70.20 |
| GPCL_GCN-2layer | 79.39 | 89.72 | 71.80 |
| Ablation_GCN-1layer | 76.83 | 76.82 | 71.45 |
| Ablation_GCN-2layer | 76.68 | 77.54 | 71.34 |
| GPCL_GAT | 79.40 | 90.45 | 73.62 |
| Ablation_GAT | 75.49 | 80.26 | 73.50 |
| GPCL_Sage | 79.94 | 88.63 | 74.01 |
| Ablation_Sage | 76.28 | 76.79 | 74.25 |

Tab. 6 Accuracies of different message aggregation methods in GPCL (unit: %)
[1] LAZER D, PENTLAND A, ADAMIC L, et al. Computational social science[J]. Science, 2009, 323(5915): 721-723.
[2] DUVENAUD D K, MACLAURIN D, AGUILERA-IPARRAGUIRRE J, et al. Convolutional networks on graphs for learning molecular fingerprints[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems — Volume 2. Cambridge: MIT Press, 2015: 2224-2232.
[3] BEPLER T, BERGER B. Learning the protein language: evolution, structure, and function[J]. Cell Systems, 2021, 12(6): 654-669.
[4] SHERVASHIDZE N, SCHWEITZER P, JAN E, et al. Weisfeiler-Lehman graph kernels[J]. The Journal of Machine Learning Research, 2011, 12: 2539-2561.
[5] BORGWARDT K M, KRIEGEL H P. Shortest-path kernels on graphs[C]// Proceedings of the 5th IEEE International Conference on Data Mining. Piscataway: IEEE, 2005: 1-8.
[6] SHERVASHIDZE N, VISHWANATHAN S V N, PETRI T H, et al. Efficient graphlet kernels for large graph comparison[C]// Proceedings of the 12th International Conference on Artificial Intelligence and Statistics. New York: JMLR, 2009: 488-495.
[7] MA Y, WANG S, AGGARWAL C C, et al. Graph convolutional networks with EigenPooling[C]// Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2019: 723-731.
[8] MAGNER A, BARANWAL M, HERO A O. Fundamental limits of deep graph convolutional networks for graph classification[J]. IEEE Transactions on Information Theory, 2022, 68(5): 3218-3233.
[9] BAEK J, KANG M, HWANG S J. Accurate learning of graph representations with graph multiset pooling[EB/OL]. [2023-09-15].
[10] LEE J, LEE I, KANG J. Self-attention graph pooling[C]// Proceedings of the 36th International Conference on Machine Learning. New York: JMLR, 2019: 3734-3743.
[11] RANJAN E, SANYAL S, TALUKDAR P. ASAP: adaptive structure aware pooling for learning hierarchical graph representations[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 5470-5477.
[12] YUAN H, JI S. StructPool: structured graph pooling via conditional random fields[EB/OL]. [2023-05-25].
[13] NEUHAUS M, BUNKE H. Automatic learning of cost functions for graph edit distance[J]. Information Sciences, 2007, 177(1): 239-247.
[14] MORRIS C, RITZERT M, FEY M, et al. Weisfeiler and Leman go neural: higher-order graph neural networks[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 4602-4609.
[15] WANG Z H, SHEN H W, CAO Q, et al. Survey on graph classification[J]. Journal of Software, 2022, 33(1): 171-192. (in Chinese)
[16] PASA L, NAVARIN N, ERB W, et al. Empowering simple graph convolutional networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(4): 4385-4399.
[17] CAI B, ZHU X, QIN Y. Parameters optimization of hybrid strategy recommendation based on particle swarm algorithm[J]. Expert Systems with Applications, 2021, 168: No.114388.
[18] CAI B, YANG X, HUANG Y, et al. A triangular personalized recommendation algorithm for improving diversity[J]. Discrete Dynamics in Nature and Society, 2018, 2018(1): No.3162068.
[19] WU L, XIA J, GAO Z, et al. GraphMixup: improving class-imbalanced node classification by reinforcement mixup and self-supervised context prediction[C]// Proceedings of the 2022 Joint European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 13716. Cham: Springer, 2023: 519-535.
[20] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks[EB/OL]. [2023-10-13].
[21] VINYALS O, BENGIO S, KUDLUR M. Order matters: sequence to sequence for sets[EB/OL]. [2023-09-01].
[22] ZHANG M, CUI Z, NEUMANN M, et al. An end-to-end deep learning architecture for graph classification[C]// Proceedings of the 32nd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2018: 4438-4445.
[23] YING R, YOU J, MORRIS C, et al. Hierarchical graph representation learning with differentiable pooling[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 4805-4815.
[24] VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks[EB/OL]. [2023-08-04].
[25] YAN Z, CAO W, JI J. Social behavior prediction with graph U-Net+[J]. Discover Internet of Things, 2021, 1: No.18.
[26] LeCUN Y. A theoretical framework for back-propagation[C]// Proceedings of the 1988 Connectionist Models Summer School. San Mateo: Morgan Kaufmann Publishers, Inc., 1989: 21-28.
[27] CANGEA C, VELIČKOVIĆ P, JOVANOVIĆ N, et al. Towards sparse hierarchical graph classifiers[EB/OL]. [2023-07-25].
[28] FEY M, LENSSEN J E. Fast graph representation learning with PyTorch Geometric[EB/OL]. [2023-07-06].
[29] PASZKE A, GROSS S, CHINTALA S, et al. Automatic differentiation in PyTorch[EB/OL]. [2023-06-19].
[30] DOBSON P D, DOIG A J. Distinguishing enzyme structures from non-enzymes without alignments[J]. Journal of Molecular Biology, 2003, 330(4): 771-783.
[31] BORGWARDT K M, ONG C S, SCHÖNAUER S, et al. Protein function prediction via graph kernels[J]. Bioinformatics, 2005, 21(S1): i47-i56.
[32] DEBNATH A K, DE COMPADRE R L L, DEBNATH G, et al. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity[J]. Journal of Medicinal Chemistry, 1991, 34(2): 786-797.
[33] YANARDAG P, VISHWANATHAN S V N. Deep graph kernels[C]// Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2015: 1365-1374.
[34] SHCHUR O, MUMME M, BOJCHEVSKI A, et al. Pitfalls of graph neural network evaluation[EB/OL]. [2024-01-30].
[35] KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. [2023-08-17].
[36] BIANCHI F M, GRATTAROLA D, ALIPPI C. Spectral clustering with graph neural networks for graph pooling[C]// Proceedings of the 37th International Conference on Machine Learning. New York: JMLR, 2020: 874-883.
[37] WU J, CHEN X, XU K, et al. Structural entropy guided graph hierarchical pooling[C]// Proceedings of the 39th International Conference on Machine Learning. New York: JMLR, 2022: 24017-24030.

Foundation: This work is partially supported by the National Natural Science Foundation of China (61802034).