Commonsense reasoning and question answering method with three-dimensional semantic features

doi:10.11772/j.issn.1001-9081.2023010063

Journal of Computer Applications, 2024, Vol. 44, Issue (1): 138-144. DOI: 10.11772/j.issn.1001-9081.2023010063

Special Issue: Artificial Intelligence

• Artificial intelligence •

Hongbin WANG^1,2,3, Xiao FANG^1,2,3, Hong JIANG^1,2,3

^1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming Yunnan 650500, China
^2. Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming Yunnan 650500, China
^3. Yunnan Key Laboratory of Computer Technology Application (Kunming University of Science and Technology), Kunming Yunnan 650500, China

Received: 2023-01-30 Revised: 2023-05-10 Accepted: 2023-05-12 Online: 2023-06-06 Published: 2024-01-10
Contact: Hong JIANG
About author: WANG Hongbin, born in 1983, Ph.D., professor. His research interests include natural language processing, information retrieval, and machine learning.
FANG Xiao, born in 1997, M.S. candidate. Her research interests include intelligent information processing.
Supported by: National Natural Science Foundation of China (61966020); Basic Research Program of Yunnan Province (202201AT070157)

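Author biographies: WANG Hongbin (born 1983), male, from Qujing, Yunnan, professor, Ph.D., CCF member. Research interests: natural language processing, information retrieval, machine learning.
FANG Xiao (born 1997), female, from Yantai, Shandong, M.S. candidate, CCF member. Research interests: intelligent information processing.
First contact: JIANG Hong (born 1965), male, from Kunming, Yunnan, lecturer, M.S. Research interests: intelligent information processing.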

Abstract

Existing commonsense question answering methods that combine pre-trained language models with knowledge graphs focus mainly on constructing knowledge graph subgraphs and fusing cross-modal information. They ignore the rich semantic features of the knowledge graph itself and do not dynamically adjust the relevance of subgraph nodes to different question answering tasks, so their prediction accuracy remains unsatisfactory. To address these problems, a commonsense reasoning and question answering method integrating three-dimensional semantic features was proposed. Firstly, quantitative indicators of semantic features at three levels (relation, entity and triple) were defined for knowledge graph nodes. Secondly, an attention mechanism was used to dynamically compute the importance of the relation-level, entity-level and triple-level semantic features to different entity nodes. Finally, a graph neural network iteratively aggregated and embedded the three-dimensional semantic features over multiple layers, in order to obtain more extrapolative knowledge representations, update the node representations of the knowledge graph subgraph, and improve the accuracy of answer prediction. Compared with the QA-GNN commonsense question answering and reasoning method, the proposed method improved accuracy on the validation set and test set of the CommonsenseQA dataset by 1.70 and 0.74 percentage points respectively, and improved accuracy on the OpenBookQA dataset with the AristoRoBERTa data processing method by 1.13 percentage points. Experimental results show that the proposed method can effectively improve the accuracy of commonsense question answering tasks.
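
The attention and aggregation steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature vectors, the scaled dot-product scoring, the residual fusion, the mean-style propagation, and all names (`fuse_levels`, `propagate`, `level_feats`, the weight matrices) are assumptions made only to show the general idea of weighting relation-level, entity-level and triple-level features per node before a graph neural network layer.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_levels(node_emb, level_feats, w_q, w_k):
    """Weight three semantic feature levels per node (hypothetical scheme).

    node_emb:    (N, d) node representations
    level_feats: (N, 3, d) relation-, entity- and triple-level features
    """
    q = node_emb @ w_q                       # (N, d) queries from nodes
    k = level_feats @ w_k                    # (N, 3, d) keys from the levels
    scores = np.einsum("nd,nld->nl", q, k) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)       # (N, 3), one weight per level
    fused = np.einsum("nl,nld->nd", weights, level_feats)
    return node_emb + fused, weights         # residual fusion (assumption)

def propagate(h, adj, w):
    # One simple message-passing layer: self-loops, row normalization, tanh.
    a = adj + np.eye(adj.shape[0])
    a = a / a.sum(axis=1, keepdims=True)
    return np.tanh(a @ h @ w)

rng = np.random.default_rng(0)
n, d = 4, 8                                  # toy subgraph: 4 nodes, dim 8
node_emb = rng.normal(size=(n, d))
level_feats = rng.normal(size=(n, 3, d))     # placeholder semantic features
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 0],
                [0, 1, 0, 0]], dtype=float)
w_q, w_k = rng.normal(size=(d, d)), rng.normal(size=(d, d))
layer_ws = [rng.normal(size=(d, d)) * 0.1 for _ in range(5)]  # 5 layers

h = node_emb
for w in layer_ws:
    h, weights = fuse_levels(h, level_feats, w_q, w_k)
    h = propagate(h, adj, w)

print(weights.round(3))   # per-node attention over the three levels
print(h.shape)
```

Each row of `weights` sums to 1, so a node can lean on whichever feature level is most useful for it; in a trained model the weight matrices would be learned rather than sampled at random.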

Key words: commonsense question answering, knowledge graph, graph neural network, semantic feature, attention mechanism

CLC Number: TP391.1

Hongbin WANG, Xiao FANG, Hong JIANG. Commonsense reasoning and question answering method with three-dimensional semantic features[J]. Journal of Computer Applications, 2024, 44(1): 138-144.

Accuracy (%) of different methods on CommonsenseQA
Method	Dev-Acc	Test-Acc
RoBERTa-large (w/o KG)	73.07	68.69
R-GCN	72.69	68.41
GconAttn	72.61	68.59
KagNet	73.47	69.01
RN	74.57	69.08
MHGRN	74.45	71.11
QA-GNN	76.54	73.41
DRGN	78.20	74.00
Proposed method	78.24	74.15

Accuracy (%) of different methods on OpenBookQA
Method	RoBERTa-Large	AristoRoBERTa
RoBERTa-large (w/o KG)	64.80	78.40
R-GCN	62.45	74.60
GconAttn	64.75	71.80
RN	65.20	75.35
MHGRN	66.85	80.60
QA-GNN	70.58	82.77
DRGN	70.10	81.80
Proposed method	70.63	83.90
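
The gains quoted in the abstract can be rechecked from the tabulated accuracies. The short sketch below copies the QA-GNN and proposed-method values from the result tables and computes the differences in percentage points; the dictionary layout and key names are illustrative choices, not part of the original article.

```python
# Accuracy values (%) copied from the result tables.
results = {
    "CommonsenseQA dev": {"QA-GNN": 76.54, "proposed": 78.24},
    "CommonsenseQA test": {"QA-GNN": 73.41, "proposed": 74.15},
    "OpenBookQA (AristoRoBERTa)": {"QA-GNN": 82.77, "proposed": 83.90},
}

# Gain of the proposed method over QA-GNN, rounded to two decimals
# to hide floating-point noise in the subtraction.
gains = {
    name: round(acc["proposed"] - acc["QA-GNN"], 2)
    for name, acc in results.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
```

The three differences agree with the 1.70, 0.74 and 1.13 percentage points reported in the abstract.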

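Ablation results (dev-set accuracy, %)
Method	Dev-Acc
w/o relation-level semantic features	77.84
w/o entity-level semantic features	77.73
w/o triple-level semantic features	77.74
w/o relation-level & entity-level semantic features	77.44
w/o relation-level & triple-level semantic features	77.24
w/o entity-level & triple-level semantic features	76.94
Proposed method	78.24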

Number of samples in each dataset
Dataset	Training set	Validation set	Test set
CommonsenseQA	18 241	2 442	2 381
OpenBookQA	4 957	500	500

Dev-set accuracy (%) for different numbers of GNN layers
GNN layers	Dev-Acc
3	75.93
4	77.18
5	78.24
6	77.68
7	77.20
