Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (11): 3156-3163. DOI: 10.11772/j.issn.1001-9081.2021010027
Special Issue: Artificial Intelligence
• Artificial intelligence •
Zhichao LI, Tohti TURDI, Hamdulla ASKAR
Received: 2021-01-11
Revised: 2021-05-24
Accepted: 2021-05-25
Online: 2021-11-29
Published: 2021-11-10
Contact: Tohti TURDI
About author: LI Zhichao, born in 1993 in Lianyuan, Hunan, M. S. candidate. His research interests include question answering systems and natural language processing.
Zhichao LI, Tohti TURDI, Hamdulla ASKAR. Answer selection model based on dynamic attention and multi-perspective matching[J]. Journal of Computer Applications, 2021, 41(11): 3156-3163.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021010027
| Dataset | Split | Questions | Question-answer pairs | Positive samples/% |
|---|---|---|---|---|
| TRECQA | Train | 1 229 | 53 417 | 12.0 |
| TRECQA | Validation | 65 | 1 117 | 18.4 |
| TRECQA | Test | 68 | 1 442 | 17.2 |
| WikiQA | Train | 873 | 8 672 | 12.0 |
| WikiQA | Validation | 126 | 1 130 | 12.4 |
| WikiQA | Test | 243 | 2 351 | 12.5 |
Tab. 1 Statistics of TRECQA and WikiQA datasets
| Method | MAP | MRR |
|---|---|---|
| Ref. [ | 72.8 | 83.2 |
| Ref. [ | 77.7 | 83.6 |
| Ref. [ | 82.1 | 89.9 |
| Ref. [ | 80.2 | 87.5 |
| Ref. [ | 75.3 | 85.1 |
| Ref. [ | 80.1 | 87.7 |
| Ref. [ | 83.8 | 90.4 |
| DAMPM (with K-Max) | 82.4 | 90.8 |
| DAMPM (with K-Threshold) | 83.7 | 91.5 |
Tab. 2 Experimental results on TRECQA dataset
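The MAP and MRR columns are standard ranking metrics. As a minimal sketch, assuming each question comes with a list of model scores and 0/1 relevance labels for its candidate answers, they can be computed as follows. The function names are illustrative, and skipping questions without any correct answer is a common convention assumed here, not something this page states.

```python
from typing import List, Sequence, Tuple


def rank_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Return the 0/1 labels reordered by descending model score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision(ranked: Sequence[int]) -> float:
    """Average precision for one question, given labels in ranked order."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this correct answer
    return total / hits if hits else 0.0


def reciprocal_rank(ranked: Sequence[int]) -> float:
    """1 / rank of the first correct answer, or 0 if there is none."""
    for rank, label in enumerate(ranked, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def map_mrr(questions: Sequence[Tuple[Sequence[float], Sequence[int]]]) -> Tuple[float, float]:
    """Mean average precision and mean reciprocal rank over all questions.

    Assumption: questions with no correct candidate are skipped.
    """
    ranked = [rank_labels(s, l) for s, l in questions]
    ranked = [r for r in ranked if any(r)]
    if not ranked:
        return 0.0, 0.0
    n = len(ranked)
    mean_ap = sum(average_precision(r) for r in ranked) / n
    mean_rr = sum(reciprocal_rank(r) for r in ranked) / n
    return mean_ap, mean_rr


if __name__ == "__main__":
    toy = [([0.9, 0.2, 0.5], [0, 1, 1]), ([0.1, 0.8], [0, 1])]
    print(map_mrr(toy))
```

The tables report both metrics as percentages, so the returned fractions would be multiplied by 100 for comparison.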
| Method | MAP | MRR |
|---|---|---|
| Ref. [ | 75.4 | 76.4 |
| Ref. [ | 74.3 | 75.5 |
| Ref. [ | 68.7 | 69.6 |
| Ref. [ | 65.2 | 66.5 |
| Ref. [ | 69.5 | 71.1 |
| Ref. [ | 70.9 | 72.3 |
| Ref. [ | 74.6 | 79.2 |
| DAMPM (with K-Max) | 76.1 | 77.2 |
| DAMPM (with K-Threshold) | 75.9 | 76.7 |
Tab. 3 Experimental results on WikiQA dataset
| Model structure | MRR | Decrease/% |
|---|---|---|
| w/o Full-Matching | 87.2 | 4.700 |
| w/o Attentive-Matching | 87.4 | 4.480 |
| w/o Max-Attentive Matching | 88.5 | 3.279 |
| w/o ELMo | 89.2 | 2.514 |
| w/o GloVe | 91.2 | 0.328 |
| DAMPM (with K-Threshold) | 91.5 | 0.000 |
Tab. 4 Ablation experimental results on TRECQA validation set
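The decrease column appears to be the relative drop in MRR with respect to the full model. The sketch below, which assumes that interpretation, recomputes it from the rounded MRR values in the table. The results agree with the published column to within about 0.001, with the small gaps presumably due to the original figures being computed from unrounded scores.

```python
# MRR of the full DAMPM (with K-Threshold) model, as listed in the table.
FULL_MRR = 91.5

# MRR after removing each component, copied from the table.
ABLATIONS = {
    "w/o Full-Matching": 87.2,
    "w/o Attentive-Matching": 87.4,
    "w/o Max-Attentive Matching": 88.5,
    "w/o ELMo": 89.2,
    "w/o GloVe": 91.2,
}


def relative_drop(full: float, ablated: float) -> float:
    """Relative decrease in percent: (full - ablated) / full * 100."""
    return (full - ablated) / full * 100.0


if __name__ == "__main__":
    for name, mrr in ABLATIONS.items():
        print(f"{name}: {relative_drop(FULL_MRR, mrr):.3f}")
```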
| 1 | TAN M, SANTOS C DOS, XIANG B, et al. LSTM-based deep learning models for non-factoid answer selection [EB/OL]. (2016-03-28) [2019-01-10]. . 10.18653/v1/p16-1044 | 
| 2 | HE H, GIMPEL K, LIN J. Multi-perspective sentence similarity modeling with convolutional neural networks [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 1576-1586. 10.18653/v1/d15-1181 | 
| 3 | GARG S, VU T, MOSCHITTI A. TANDA: transfer and adapt pre-trained transformer models for answer sentence selection [EB/OL]. [2020-05-01]. . 10.1609/aaai.v34i05.6282 | 
| 4 | SUN Y, WANG J, ZHANG Y J, et al. Long answer selection neural model integrating coarse and fine granularity information [J]. Journal of Chinese Information Processing, 2021, 35(4): 100-109. 10.3969/j.issn.1003-0077.2021.04.014 | 
| 5 | FENG W Z, TANG J. Answer selection model integrating depth matching features [J]. Journal of Chinese Information Processing, 2019, 33(1): 118-124. 10.3969/j.issn.1003-0077.2019.01.014 | 
| 6 | PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg: ACL, 2018: 2227-2237. | 
| 7 | KENTER T, BORISOV A, DE RIJKE M. Siamese CBOW: optimizing word embeddings for sentence representations [C]// Proceedings of the 2016 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 941-951. 10.18653/v1/p16-1089 | 
| 8 | MUELLER J, THYAGARAJAN A. Siamese recurrent architectures for learning sentence similarity [C]// Proceedings of the 2016 30th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2016: 2786-2792. 10.1609/aaai.v34i10.7136 | 
| 9 | NECULOIU P, VERSTEEGH M, ROTARU M. Learning text similarity with Siamese recurrent networks [C]// Proceedings of the 1st Workshop on Representation Learning for NLP. Stroudsburg: ACL, 2016: 148-157. 10.18653/v1/w16-1617 | 
| 10 | BIAN W J, LI S, YANG Z, et al. A compare-aggregate model with dynamic-clip attention for answer selection [C]// Proceedings of the 2017 ACM Conference on Information and Knowledge Management. New York: ACM, 2017: 1987-1990. 10.1145/3132847.3133089 | 
| 11 | SHA L, ZHANG X D, QIAN F, et al. A multi-view fusion neural network for answer selection [C]// Proceedings of the 2018 32nd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 5422-5429. | 
| 12 | YOON S, DERNONCOURT F, KIM D S, et al. A compare-aggregate model with latent clustering for answer selection [C]// Proceedings of the 2019 28th ACM International Conference on Information and Knowledge Management. New York: ACM, 2019: 2093-2096. 10.1145/3357384.3358148 | 
| 13 | WANG S H, JIANG J. A compare-aggregate model for matching text sequences [EB/OL]. (2016-11-06) [2019-05-05]. . 10.1109/ijcnn.2019.8852062 | 
| 14 | TAN M, SANTOS C DOS, XIANG B, et al. Improved representation learning for question answer matching [C]// Proceedings of the 2016 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 464-473. 10.18653/v1/p16-1044 | 
| 15 | SANTOS C DOS, TAN M, XIANG B, et al. Attentive pooling networks [EB/OL]. (2016-02-11) [2019-07-05]. . | 
| 16 | LASKAR M T R, HUANG J, HOQUE E. Contextualized embeddings based transformer encoder for sentence similarity modeling in answer selection task [C]// Proceedings of the 2020 12th Language Resources and Evaluation Conference. Paris: European Language Resources Association, 2020: 5505-5514. | 
| 17 | PENNINGTON J, SOCHER R, MANNING C D. GloVe: global vectors for word representation [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2014: 1532-1543. 10.3115/v1/d14-1162 | 
| 18 | WANG M Q, SMITH N A, MITAMURA T. What is the Jeopardy model? a quasi-synchronous grammar for QA [C]// Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Stroudsburg: ACL, 2007: 22-32. | 
| 19 | YANG Y, YIH W T, MEEK C. WikiQA: a challenge dataset for open-domain question answering [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 2013-2018. 10.18653/v1/d15-1237 | 
| 20 | RAO J F, HE H, LIN J. Noise-contrastive estimation for answer selection with deep neural networks [C]// Proceedings of the 2016 25th ACM International on Conference on Information and Knowledge Management. New York: ACM, 2016: 1913-1916. 10.1145/2983323.2983872 | 
| 21 | TAY Y, TUAN L A, HUI S C. Multi-cast attention networks for retrieval-based question answering and response prediction [C]// Proceedings of the 2018 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2018: 2299-2308. 10.1145/3219819.3220048 | 
| 22 | SEVERYN A, MOSCHITTI A. Modeling relational information in question-answer pairs with convolutional neural networks [EB/OL]. (2016-04-05) [2019-08-12]. . 10.1145/2766462.2767738 | 
| 23 | HE H, LIN J. Pairwise word interaction modeling with deep neural networks for semantic similarity measurement [C]// Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2016: 937-948. 10.18653/v1/n16-1108 | 
| 24 | JIN Z X, ZHANG B W, ZHOU F, et al. Ranking via partial ordering for answer selection [J]. Information Sciences, 2020, 538: 358-371. 10.1016/j.ins.2020.05.110 | 
| 25 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. | 
| 26 | HOWARD J, RUDER S. Universal language model fine-tuning for text classification [C]// Proceedings of the 2018 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 328-339. 10.18653/v1/p18-1031 | 