Feature selection method for graph neural network based on network architecture design

doi:10.11772/j.issn.1001-9081.2023030353

Abstract

In recent years, researchers have proposed many improved architecture designs for Graph Neural Networks (GNNs), driving performance gains in a variety of prediction tasks. However, most GNN variants start from the assumption that all node features are equally important, which is not the case in practice. To address this problem, a feature selection method is proposed that improves existing models and selects an important feature subset for each dataset. The method consists of two components: a feature selection layer and a separate label-feature mapping. In the feature selection layer, a Softmax normalizer and a feature “soft selector” are used to select features. The model structure is designed around the idea of separate label-feature mapping, so that a subset of related features is selected for each label; the union of these subsets gives the important feature subset of the dataset. Graph ATtention network (GAT) and GATv2 were chosen as the benchmark models, and the method was applied to them to obtain new models. Experimental results show that, on node classification tasks over six datasets, the proposed models improve accuracy by 0.83% - 8.79% compared with the benchmark models. The new models also select an important feature subset for each of the six datasets, containing 3.94% - 12.86% of the total number of features in the respective dataset. When the important feature subset is used as the new input of a benchmark model, the model still achieves more than 95% of the accuracy obtained with all features; that is, the scale of the model is reduced while accuracy is maintained. The proposed method can therefore improve node classification accuracy and effectively select an important feature subset for a dataset.
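The selection idea in the abstract can be illustrated with a minimal sketch. It assumes one score vector per label, Softmax normalization over the features, and a top-k rule to turn soft weights into a hard subset. The top-k rule, the function names, and the example scores are hypothetical and are not taken from the paper; the only parts grounded in the abstract are the Softmax normalization, the soft reweighting of features, and the union of per-label subsets.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_select(x, scores):
    """Reweight a feature vector x by softmax-normalized scores (soft selection)."""
    weights = softmax(scores)
    return [xi * wi for xi, wi in zip(x, weights)]

def top_k_features(scores, k):
    """Indices of the k highest-weighted features (assumed hard-selection rule)."""
    weights = softmax(scores)
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return set(ranked[:k])

# Hypothetical scores: one vector per label, five features each.
scores_per_label = {
    0: [2.0, 0.1, 0.0, 1.5, -1.0],
    1: [0.0, 0.2, 3.0, 0.1, 2.5],
}

# One related-feature subset per label, then the union over all labels.
subsets = {label: top_k_features(s, k=2) for label, s in scores_per_label.items()}
important = set().union(*subsets.values())

print(subsets)            # per-label related feature subsets
print(sorted(important))  # important feature subset of the dataset
```

In a trained model the scores would be learned parameters; here they are fixed by hand so that the union step is easy to follow.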

Key words: Graph Neural Network (GNN), Graph ATtention network (GAT), feature selection, node classification, deep learning


CLC Number:

TP183

Dapeng XU, Xinmin HOU. Feature selection method for graph neural network based on network architecture design[J]. Journal of Computer Applications, 2024, 44(3): 663-670.


Figures/Tables 9

Fig. 1 Structure of FSGNN model

Tab. 1 Statistics of experimental datasets

Dataset	Nodes	Edges	Feature dimension	Labels	Training nodes	Validation nodes	Test nodes	Homophily ratio
Cora	2 708	5 429	1 433	7	140	500	1 000	0.81
Citeseer	3 327	4 732	3 703	6	120	500	1 000	0.74
Pubmed	19 717	44 338	500	3	60	500	1 000	0.80
Cornell	183	295	1 703	5	87	59	37	0.30
Texas	183	309	1 703	5	87	59	37	0.11
Wisconsin	251	499	1 703	5	120	80	51	0.21
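The page does not say how the homophily ratio in the last column is defined. A common choice is the edge homophily ratio, the fraction of edges whose two endpoints share a label. The sketch below assumes that definition; the function name and the toy graph are hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

# Toy graph: four nodes, two labels, four edges.
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

ratio = edge_homophily(edges, labels)
print(ratio)  # two of the four edges join same-label nodes
```

Under this definition, values near 1 (Cora, Citeseer, Pubmed) indicate that neighbors mostly share labels, while low values (Cornell, Texas, Wisconsin) indicate heterophilous graphs.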

Tab. 2 Statistics of node classification accuracy for different models

Model	Cora	Citeseer	Pubmed	Cornell	Texas	Wisconsin
GCN	81.50	70.30	79.00	58.65	61.35	54.71
DGI	82.50	71.60	78.40	57.70	59.70	54.80
GCNII	85.50	73.40	80.30	74.86	69.46	74.12
SEP-N	84.80	72.90	80.20	57.40	60.60	61.20
GAT	83.00	72.50	79.00	58.92	58.38	55.29
GATv2	82.30	72.20	78.50	57.84	61.35	54.90
FSGAT	84.40	73.10	80.50	60.00	63.51	56.67
FSGATv2	83.20	73.00	80.70	59.46	64.60	56.08

Tab. 3 Classification confusion matrix of dataset Pubmed under model FSGAT

Actual label	Predicted 0	Predicted 1	Predicted 2	Total
0	140	16	24	180
1	22	345	46	413
2	28	59	320	407
Total	190	420	390	1 000
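As a consistency check, overall accuracy can be recomputed from this confusion matrix as the sum of the diagonal divided by the number of test nodes. The sketch below uses the counts from the table (rows are actual labels, columns are predicted labels); the variable and function names are illustrative.

```python
# Rows: actual labels 0, 1, 2. Columns: predicted labels 0, 1, 2.
confusion = [
    [140, 16, 24],
    [22, 345, 46],
    [28, 59, 320],
]

def accuracy(matrix):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

acc = accuracy(confusion)
print(f"{acc:.2%}")  # 80.50%
```

The result, 805 correct out of 1 000 test nodes, agrees with the 80.50 reported for FSGAT on Pubmed in Tab. 2.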

Tab. 4 Classification confusion matrix of dataset Pubmed under model FSGATv2

Actual label	Predicted 0	Predicted 1	Predicted 2	Total
0	150	15	15	180
1	23	340	50	413
2	24	66	317	407
Total	197	421	382	1 000
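Per-class precision and recall are not reported on this page, but they follow directly from the confusion matrix: recall divides each diagonal entry by its row total, and precision divides it by its column total. The sketch below applies this to the FSGATv2 counts; the function name is illustrative.

```python
# Rows: actual labels 0, 1, 2. Columns: predicted labels 0, 1, 2.
confusion_v2 = [
    [150, 15, 15],
    [23, 340, 50],
    [24, 66, 317],
]

def per_class_metrics(matrix):
    """Return a list of (precision, recall) pairs, one per class."""
    n = len(matrix)
    metrics = []
    for c in range(n):
        row_total = sum(matrix[c])                       # nodes whose actual label is c
        col_total = sum(matrix[r][c] for r in range(n))  # nodes predicted as c
        metrics.append((matrix[c][c] / col_total, matrix[c][c] / row_total))
    return metrics

for label, (precision, recall) in enumerate(per_class_metrics(confusion_v2)):
    print(f"label {label}: precision={precision:.3f} recall={recall:.3f}")
```

Summing the diagonal gives 807 correct predictions out of 1 000, which matches the 80.70 reported for FSGATv2 on Pubmed in Tab. 2.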

Tab. 5 Statistics of node classification accuracy in ablation experiments

Algorithm	Cora	Citeseer	Pubmed	Cornell	Texas	Wisconsin
GAT_map	83.30	71.80	79.30	58.38	61.62	56.27
GAT_fs	83.60	72.80	79.40	58.37	63.24	55.49
GATv2_map	82.10	72.30	79.20	58.65	63.51	55.29
GATv2_fs	82.30	71.40	79.30	59.19	62.70	54.71

Fig. 2 Similarity heat maps of datasets Cora and Cornell

Tab. 6 Statistics of node classification accuracy (%) of the selected feature subsets on GAT, GATv2 and GCN

Dataset	$F_1$	$F_2$	d	GAT_1	GAT_2	GAT	GATv2_1	GATv2_2	GATv2	GCN_1	GCN_2	GCN
Cora	92	100	1 433	82.10	81.10	83.00	79.50	80.30	82.30	79.50	79.10	81.50
Citeseer	157	146	3 703	72.00	70.80	72.50	70.08	71.90	72.20	68.30	68.10	70.30
Pubmed	51	46	500	78.50	78.20	79.00	79.40	79.60	78.50	78.10	78.10	79.00
Cornell	219	182	1 703	55.95	54.05	58.92	58.37	58.37	57.84	58.64	59.20	58.65
Texas	129	158	1 703	58.91	59.46	58.38	59.45	60.80	61.35	61.08	60.81	61.35
Wisconsin	137	128	1 703	55.29	53.29	55.29	54.90	55.88	54.90	53.73	54.90	54.71
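The share of features kept in each subset can be recomputed from the table by dividing each of the two subset sizes by the feature dimension d. The sketch below does this for all six datasets; the dictionary layout and variable names are illustrative.

```python
# (first subset size, second subset size, feature dimension d), taken from the table.
table6 = {
    "Cora": (92, 100, 1433),
    "Citeseer": (157, 146, 3703),
    "Pubmed": (51, 46, 500),
    "Cornell": (219, 182, 1703),
    "Texas": (129, 158, 1703),
    "Wisconsin": (137, 128, 1703),
}

# Fraction of the original features kept by each subset.
fractions = {name: (f1 / d, f2 / d) for name, (f1, f2, d) in table6.items()}

for name, (p1, p2) in fractions.items():
    print(f"{name}: {p1:.2%} and {p2:.2%} of features kept")

all_fractions = [p for pair in fractions.values() for p in pair]
print(f"range: {min(all_fractions):.2%} to {max(all_fractions):.2%}")
```

The smallest share (146 of 3 703 on Citeseer) and the largest (219 of 1 703 on Cornell) reproduce the 3.94% - 12.86% range stated in the abstract.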

Tab. 7 Statistics of node classification accuracy reported in reference [26]

Dataset	Number of selected features	Accuracy/%
Cora	225	68.20
Citeseer	450	57.70
Pubmed	105	66.80
Cornell	255	48.85
Texas	255	50.10
Wisconsin	255	43.20

References 37

1	SCARSELLI F, GORI M, TSOI A C, et al. The graph neural network model [J]. IEEE Transactions on Neural Networks, 2009, 20(1): 61-80. doi:10.1109/tnn.2008.2005605
2	SCARSELLI F, TSOI A C, GORI M, et al. Graphical-based learning environments for pattern recognition [C]// Proceedings of the 2004 Structural, Syntactic, and Statistical Pattern Recognition. Cham: Springer, 2004: 42-56. doi:10.1007/978-3-540-27868-9_4
3	ZHOU F, CAO C, ZHANG K, et al. Meta-GNN: on few-shot node classification in graph meta-learning [C]// Proceedings of the 28th ACM International Conference on Information and Knowledge Management. New York: ACM, 2019: 2357-2360. doi:10.1145/3357384.3358106
4	HANG M, NEVILLE J, RIBEIRO B. A collective learning framework to boost GNN expressiveness for node classification [EB/OL]. [2023-01-11].
5	CHEN M, WEI Z, HUANG Z, et al. Simple and deep graph convolutional networks [C]// Proceedings of the 37th International Conference on Machine Learning. New York: JMLR.org, 2020: 1725-1735.
6	ZHAO T, ZHANG X, WANG S. GraphSMOTE: imbalanced node classification on graphs with graph neural networks [C]// Proceedings of the 14th ACM International Conference on Web Search and Data Mining. New York: ACM, 2021: 833-841. doi:10.1145/3437963.3441720
7	WU W, LI B, LUO C, et al. Hashing-accelerated graph neural networks for link prediction [C]// Proceedings of the 2021 Web Conference. New York: ACM, 2021: 2910-2920. doi:10.1145/3442381.3449884
8	YING R, HE R, CHEN K, et al. Graph convolutional neural networks for web-scale recommender systems [C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. New York: ACM, 2018: 974-983. doi:10.1145/3219819.3219890
9	ZHANG M, CHEN Y. Link prediction based on graph neural networks [C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2018: 5171-5181.
10	WU B, YANG X, PAN S, et al. Adapting membership inference attacks to GNN for graph classification: approaches and implications [C]// Proceedings of the 2021 IEEE International Conference on Data Mining. Piscataway: IEEE, 2021: 1421-1426. doi:10.1109/icdm51629.2021.00182
11	LE T, BERTOLINI M, NOÉ F, et al. Parameterized hypercomplex graph neural networks for graph classification [C]// Proceedings of the 30th International Conference on Artificial Neural Networks. Berlin: Springer, 2021: 204-216. doi:10.1007/978-3-030-86365-4_17
12	MA H, BIAN Y, RONG Y, et al. Multi-view graph neural networks for molecular property prediction [EB/OL]. (2020-06-12) [2022-04-21]. doi:10.1093/bioinformatics/btac039
13	MENG Y, ZONG S, LI X, et al. GNN-LM: language modeling based on global contexts via GNN [EB/OL]. (2022-05-04) [2022-10-11].
14	MAURYA S K, LIU X, MURATA T. Graph neural networks for fast node ranking approximation [J]. ACM Transactions on Knowledge Discovery from Data, 2021, 15(5): 78. doi:10.1145/3446217
15	WU S, SUN F, ZHANG W, et al. Graph neural networks in recommender systems: a survey [J]. ACM Computing Surveys, 2022, 55(5): 97. doi:10.1145/3535101
16	ZHANG H, LU G, ZHAN M, et al. Semi-supervised classification of graph convolutional networks with Laplacian rank constraints [J]. Neural Processing Letters, 2022, 54: 2645-2656. doi:10.1007/s11063-020-10404-7
17	RUIZ L, GAMA F, RIBEIRO A. Gated graph recurrent neural networks [J]. IEEE Transactions on Signal Processing, 2020, 68: 6303-6318. doi:10.1109/tsp.2020.3033962
18	HAMILTON W L, YING Z, LESKOVEC J. Inductive representation learning on large graphs [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 1025-1035. doi:10.7551/mitpress/11474.003.0014
19	VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks [EB/OL]. [2021-06-02].
20	BRODY S, ALON U, YAHAV E. How attentive are graph attention networks? [EB/OL]. [2022-09-05].
21	BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate [EB/OL]. (2016-05-19) [2021-03-16]. doi:10.1017/9781108608480.003
22	WU J, CHEN X, XU K, et al. Structural entropy guided graph hierarchical pooling [C]// Proceedings of the 39th International Conference on Machine Learning. New York: JMLR.org, 2022: 24017-24030.
23	CHENG H, ZHOU J T, TAY W P, et al. Attentive graph neural networks for few-shot learning [C]// Proceedings of the 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval. Piscataway: IEEE, 2022: 152-157. doi:10.1109/mipr54900.2022.00033
24	FAN W, LIU K, LIU H, et al. AutoFS: automated feature selection via diversity-aware interactive reinforcement learning [C]// Proceedings of the 2020 IEEE International Conference on Data Mining. Piscataway: IEEE, 2020: 1008-1013. doi:10.1109/icdm50108.2020.00117
25	ZHAO X, LIU K, FAN W, et al. Simplifying reinforced feature selection via restructured choice strategy of single agent [C]// Proceedings of the 2020 IEEE International Conference on Data Mining. Piscataway: IEEE, 2020: 871-880. doi:10.1109/icdm50108.2020.00096
26	ACHARYA D B, ZHANG H. Feature selection and extraction for graph neural networks [C]// Proceedings of the 2020 ACM Southeast Conference. New York: ACM, 2020: 252-255. doi:10.1145/3374135.3385309
27	ABID A, BALIN M F, ZOU J. Concrete autoencoders for differentiable feature selection and reconstruction [EB/OL]. (2019-01-31) [2022-04-25].
28	BAMA S S, SARAVANAN A. Efficient classification using average weighted pattern score with attribute rank based feature selection [J]. International Journal of Intelligent Systems and Applications, 2019, 11(7): 29-42. doi:10.5815/ijisa.2019.07.04
29	SREEJA N K, SANKAR A. Pattern matching based classification using ant colony optimization based feature selection [J]. Applied Soft Computing, 2015, 31: 91-102. doi:10.1016/j.asoc.2015.02.036
30	XIONG S, LIU R, YI C. Graph-AutoFS: auto feature selection in graph neural [C]// Proceedings of the 7th International Conference on Computing and Data Engineering. New York: ACM, 2021: 41-46. doi:10.1145/3456172.3456191
31	MAURYA S K, LIU X, MURATA T. Improving graph neural networks with simple architecture design [EB/OL]. (2021-05-17) [2021-11-15]. doi:10.1145/3446217
32	WANG Y, ZHAO X, XU T, et al. AutoField: automating feature selection in deep recommender systems [C]// Proceedings of the 2022 ACM Web Conference. New York: ACM, 2022: 1977-1986. doi:10.1145/3485447.3512071
33	ZHU J, YAN Y, ZHAO L, et al. Beyond homophily in graph neural networks: current limitations and effective designs [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2020: 7793-7804.
34	SEN P, NAMATA G, BILGIC M, et al. Collective classification in network data [J]. AI Magazine, 2008, 29(3): 93-107. doi:10.1609/aimag.v29i3.2157
35	PEI H, WEI B, CHANG K C C, et al. Geom-GCN: geometric graph convolutional networks [EB/OL]. (2020-02-14) [2021-03-05].
36	KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks [EB/OL]. (2017-02-22) [2021-05-27]. doi:10.48550/arXiv.1609.02907
37	VELIČKOVIĆ P, FEDUS W, HAMILTON W L, et al. Deep graph infomax [C/OL]// Proceedings of the 2019 International Conference on Learning Representations. [2023-03-01].
