Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (5): 1324-1329. DOI: 10.11772/j.issn.1001-9081.2021030508
Special Issue: Artificial Intelligence
Received: 2021-04-06
Revised: 2021-06-18
Accepted: 2021-06-21
Online: 2022-06-11
Published: 2022-05-10
Contact: Yongguo LIU
About author: YANG Shigang, born in 1998 in Guang'an, Sichuan, M.S. candidate. His research interests include text classification.
Shigang YANG, Yongguo LIU. Short text classification method by fusing corpus features and graph attention network[J]. Journal of Computer Applications, 2022, 42(5): 1324-1329.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021030508
| Dataset | Training samples | Test samples | Classes | Avg. length |
| --- | --- | --- | --- | --- |
| Biomedical | 17 976 | 1 998 | 20 | 7.8 |
| Dblp | 61 422 | 20 000 | 6 | 8.5 |
| MR | 7 074 | 3 554 | 2 | 20.4 |
| SST1 | 9 600 | 2 210 | 5 | 18.4 |
| SST2 | 7 770 | 1 821 | 2 | 18.5 |
| TREC | 5 394 | 500 | 6 | 11.3 |

Tab. 1 Dataset information
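The per-split columns of Tab. 1 (sample count, number of classes, average token length) can be reproduced with a small helper. This is an illustrative sketch, not code from the paper; the function name and whitespace tokenization are assumptions.

```python
def dataset_stats(texts, labels):
    """Per-split summary matching Tab. 1's columns: sample count,
    number of classes, and average length in whitespace tokens."""
    lengths = [len(t.split()) for t in texts]
    return {
        "samples": len(texts),
        "classes": len(set(labels)),
        "avg_length": round(sum(lengths) / len(lengths), 1),
    }

# toy example with two texts and two classes
stats = dataset_stats(["what is a graph", "short text"], ["q", "s"])
```

Average length matters here because the datasets are short texts (7.8 to 20.4 tokens on average), which limits the word co-occurrence signal available to graph construction.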
| Model | Biomedical | Dblp | MR | SST1 | SST2 | TREC | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Text-CNN | 0.6657 | 0.7667 | 0.7617 | 0.4127 | 0.8103 | 0.9688 | 0.7310 |
| Bi-LSTM | 0.6436 | 0.7522 | 0.7580 | 0.4036 | 0.8067 | 0.9727 | 0.7228 |
| Text-GCN | 0.6862 | 0.7777 | 0.7634 | 0.3873 | 0.8166 | 0.9060 | 0.7229 |
| TL-GNN | 0.6661 | 0.7715 | 0.7470 | 0.3824 | 0.7946 | 0.9720 | 0.7223 |
| STCKA | 0.6802 | 0.7724 | 0.7670 | 0.4167 | 0.8207 | 0.9720 | 0.7382 |
| DE-CNN | 0.6527 | 0.7565 | 0.6820 | 0.4181 | 0.7831 | 0.9609 | 0.7089 |
| Text-ING | 0.6937 | 0.7653 | 0.7723 | 0.4512 | 0.8346 | 0.9800 | 0.7495 |
| NE-GAT | 0.7032 | 0.7832 | 0.7802 | 0.4466 | 0.8353 | 0.9800 | 0.7548 |

Tab. 2 Comparison of test set accuracy of different methods
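The graph-based models in Tab. 2 build on graph attention: each word node re-weights its neighbors before aggregating them. Below is a minimal single-head attention layer in the spirit of the graph attention network of Veličković et al., as a sketch only; NE-GAT additionally fuses corpus-level node and edge importance, which this sketch omits, and all names and shapes here are illustrative assumptions.

```python
import numpy as np

def gat_layer(X, A, W, a, slope=0.2):
    """One single-head graph attention layer.

    X: (N, F) node features, A: (N, N) adjacency with self-loops,
    W: (F, F') projection, a: (2*F',) attention vector.
    """
    H = X @ W                                     # project node features
    Fp = H.shape[1]
    # e_ij = LeakyReLU(a^T [h_i || h_j]), computed as a split dot product
    e = (H @ a[:Fp])[:, None] + (H @ a[Fp:])[None, :]
    e = np.where(e > 0, e, slope * e)             # LeakyReLU
    e = np.where(A > 0, e, -1e9)                  # restrict to graph neighbors
    w = np.exp(e - e.max(axis=1, keepdims=True))  # numerically stable softmax
    att = w / w.sum(axis=1, keepdims=True)
    return att @ H                                # attention-weighted aggregation

# tiny example: a 4-word text graph, fully connected for simplicity
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
A = np.ones((4, 4))
W = rng.normal(size=(8, 8))
a = rng.normal(size=(16,))
out = gat_layer(X, A, W, a)  # (4, 8) updated word representations
```

Masking non-neighbors with a large negative logit before the softmax is what makes the attention respect the text graph's structure rather than attending over all words.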
| Model | Biomedical | Dblp | MR | SST1 | SST2 | TREC | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GAT | 0.6772 | 0.7743 | 0.7425 | 0.4195 | 0.8062 | 0.9700 | 0.7316 |
| GAT+node | 0.7012 | 0.7809 | 0.7794 | 0.4294 | 0.8280 | 0.9780 | 0.7495 |
| GAT+edge | 0.6997 | 0.7784 | 0.7717 | 0.4290 | 0.8142 | 0.9740 | 0.7445 |
| NE-GAT | 0.7032 | 0.7832 | 0.7802 | 0.4466 | 0.8353 | 0.9800 | 0.7548 |

Tab. 3 Ablation results for each module (accuracy)
| Neighbors | Biomedical | Dblp | MR | SST1 | SST2 | TREC | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.7032 | 0.7832 | 0.7802 | 0.4466 | 0.8353 | 0.9800 | 0.7548 |
| 2 | 0.7022 | 0.7834 | 0.7822 | 0.4511 | 0.8325 | 0.9780 | 0.7549 |
| 3 | 0.6927 | 0.7833 | 0.7788 | 0.4475 | 0.8457 | 0.9760 | 0.7540 |
| 4 | 0.7007 | 0.7813 | 0.7780 | 0.4448 | 0.8457 | 0.9680 | 0.7531 |
| 5 | 0.7027 | 0.7819 | 0.7760 | 0.4462 | 0.8391 | 0.9700 | 0.7527 |

Tab. 4 Test accuracy comparison with different numbers of neighbors
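The neighbor count tuned in Tab. 4 controls how many surrounding words each word is linked to when the per-text graph is built. A plain sliding-window construction is one common scheme; the exact construction used by the paper is an assumption here, and this function is only a sketch.

```python
def build_text_graph(tokens, k=1):
    """Adjacency matrix for one short text: word i links to the words
    within k positions on either side, plus a self-loop."""
    n = len(tokens)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(max(0, i - k), min(n, i + k + 1)):
            A[i][j] = 1  # symmetric by construction of the window
    return A
```

Tab. 4 suggests small windows suffice: accuracy peaks at 1 or 2 neighbors and degrades slightly beyond that, consistent with the short average text lengths in Tab. 1.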
| Layers | Biomedical | Dblp | MR | SST1 | SST2 | TREC | AVG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0.7032 | 0.7832 | 0.7802 | 0.4466 | 0.8353 | 0.9800 | 0.7548 |
| 2 | 0.6942 | 0.7784 | 0.7831 | 0.4217 | 0.8358 | 0.9820 | 0.7492 |
| 3 | 0.6922 | 0.7857 | 0.7836 | 0.4308 | 0.8380 | 0.9860 | 0.7527 |
| 4 | 0.6972 | 0.7765 | 0.7808 | 0.4462 | 0.8237 | 0.9840 | 0.7514 |
| 5 | 0.6997 | 0.7761 | 0.7780 | 0.4258 | 0.8375 | 0.9860 | 0.7505 |

Tab. 5 Test accuracy comparison with different numbers of layers