Journal of Computer Applications, 2021, Vol. 41, Issue (11): 3156-3163. DOI: 10.11772/j.issn.1001-9081.2021010027
Special Issue: Artificial Intelligence
Zhichao LI, Tohti TURDI, Hamdulla ASKAR
Received: 2021-01-11
Revised: 2021-05-24
Accepted: 2021-05-25
Online: 2021-11-29
Published: 2021-11-10
Contact: Tohti TURDI
About author: LI Zhichao, born in 1993, a native of Lianyuan, Hunan, M. S. candidate. His research interests include question answering systems and natural language processing.
Zhichao LI, Tohti TURDI, Hamdulla ASKAR. Answer selection model based on dynamic attention and multi-perspective matching[J]. Journal of Computer Applications, 2021, 41(11): 3156-3163.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021010027
Dataset | Split | Number of questions | Number of question-answer pairs | Positive samples/% |
---|---|---|---|---|
TRECQA | Train | 1 229 | 53 417 | 12.0 |
TRECQA | Validation | 65 | 1 117 | 18.4 |
TRECQA | Test | 68 | 1 442 | 17.2 |
WikiQA | Train | 873 | 8 672 | 12.0 |
WikiQA | Validation | 126 | 1 130 | 12.4 |
WikiQA | Test | 243 | 2 351 | 12.5 |
Tab. 1 Statistics of TRECQA and WikiQA datasets
Method | MAP | MRR |
---|---|---|
Ref. [ | 72.8 | 83.2 |
Ref. [ | 77.7 | 83.6 |
Ref. [ | 82.1 | 89.9 |
Ref. [ | 80.2 | 87.5 |
Ref. [ | 75.3 | 85.1 |
Ref. [ | 80.1 | 87.7 |
Ref. [ | 83.8 | 90.4 |
DAMPM (with K-Max) | 82.4 | 90.8 |
DAMPM (with K-Threshold) | 83.7 | 91.5 |
Tab. 2 Experimental results on TRECQA dataset
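Tables 2 and 3 report MAP (mean average precision) and MRR (mean reciprocal rank). As a minimal sketch of how these two ranking metrics are commonly computed for answer selection, the Python below assumes that each question comes with binary relevance labels already sorted by descending model score. The function names, the input layout, and the convention of skipping questions that have no correct candidate are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, Tuple


def average_precision(labels: List[int]) -> float:
    """Average precision for one question.

    `labels` holds 1 (correct) or 0 (incorrect) for each candidate answer,
    ordered from the highest-scored candidate to the lowest.
    """
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this correct answer
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(labels: List[int]) -> float:
    """1 / rank of the first correct answer, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def map_mrr(ranked: Dict[str, List[int]]) -> Tuple[float, float]:
    """Return (MAP, MRR) as fractions in [0, 1] over all questions.

    Assumption: questions without any correct candidate are skipped.
    """
    scored = [labels for labels in ranked.values() if any(labels)]
    if not scored:
        return 0.0, 0.0
    n = len(scored)
    mean_ap = sum(average_precision(labels) for labels in scored) / n
    mrr = sum(reciprocal_rank(labels) for labels in scored) / n
    return mean_ap, mrr


# Hypothetical toy data: two questions with ranked candidate labels.
toy = {"q1": [0, 1, 0, 1], "q2": [1, 0, 0]}
toy_map, toy_mrr = map_mrr(toy)
```

Multiplying the returned fractions by 100 gives values on the same scale as the tables.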
Method | MAP | MRR |
---|---|---|
Ref. [ | 75.4 | 76.4 |
Ref. [ | 74.3 | 75.5 |
Ref. [ | 68.7 | 69.6 |
Ref. [ | 65.2 | 66.5 |
Ref. [ | 69.5 | 71.1 |
Ref. [ | 70.9 | 72.3 |
Ref. [ | 74.6 | 79.2 |
DAMPM (with K-Max) | 76.1 | 77.2 |
DAMPM (with K-Threshold) | 75.9 | 76.7 |
Tab. 3 Experimental results on WikiQA dataset
Model structure | MRR | Decrease/% |
---|---|---|
w/o Full-Matching | 87.2 | 4.700 |
w/o Attentive-Matching | 87.4 | 4.480 |
w/o Max-Attentive Matching | 88.5 | 3.279 |
w/o ELMo | 89.2 | 2.514 |
w/o GloVe | 91.2 | 0.328 |
DAMPM (with K-Threshold) | 91.5 | 0.000 |
Tab. 4 Ablation experimental results on TRECQA validation set
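The decrease column in Table 4 appears to be the relative drop in MRR with respect to the full DAMPM (with K-Threshold) model. The sketch below recomputes it under that assumption; the function name `relative_drop` is hypothetical. With the rounded MRR values from the table, the recomputed figures agree with the reported ones to within about 0.001, a gap that is presumably due to the MRR values being rounded to one decimal place.

```python
def relative_drop(full: float, ablated: float) -> float:
    """Relative decrease of `ablated` with respect to `full`, in percent."""
    return (full - ablated) / full * 100.0


# MRR values copied from Table 4.
FULL_MRR = 91.5
ablated_mrr = {
    "w/o Full-Matching": 87.2,
    "w/o Attentive-Matching": 87.4,
    "w/o Max-Attentive Matching": 88.5,
    "w/o ELMo": 89.2,
    "w/o GloVe": 91.2,
}

# Recomputed decrease for every ablated variant.
drops = {name: relative_drop(FULL_MRR, mrr) for name, mrr in ablated_mrr.items()}

for name, drop in drops.items():
    print(f"{name}: {drop:.3f}")
```

Read this way, removing the Full-Matching or Attentive-Matching strategy costs the most MRR, while removing GloVe costs the least.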
1 | TAN M, SANTOS C DOS, XIANG B, et al. LSTM-based deep learning models for non-factoid answer selection [EB/OL]. (2016-03-28) [2019-01-10]. 10.18653/v1/p16-1044 |
2 | HE H, GIMPEL K, LIN J. Multi-perspective sentence similarity modeling with convolutional neural networks [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 1576-1586. 10.18653/v1/d15-1181 |
3 | GARG S, VU T, MOSCHITTI A. TANDA: transfer and adapt pre-trained transformer models for answer sentence selection [EB/OL]. [2020-05-01]. 10.1609/aaai.v34i05.6282 |
4 | SUN Y, WANG J, ZHANG Y J, et al. Long answer selection neural model integrating coarse and fine granularity information [J]. Journal of Chinese Information Processing, 2021, 35(4): 100-109. 10.3969/j.issn.1003-0077.2021.04.014 |
5 | FENG W Z, TANG J. Answer selection model integrating depth matching features [J]. Journal of Chinese Information Processing, 2019, 33(1): 118-124. 10.3969/j.issn.1003-0077.2019.01.014 |
6 | PETERS M E, NEUMANN M, IYYER M, et al. Deep contextualized word representations [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Stroudsburg: ACL, 2018: 2227-2237. |
7 | KENTER T, BORISOV A, DE RIJKE M. Siamese CBOW: optimizing word embeddings for sentence representations [C]// Proceedings of the 2016 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 941-951. 10.18653/v1/p16-1089 |
8 | MUELLER J, THYAGARAJAN A. Siamese recurrent architectures for learning sentence similarity [C]// Proceedings of the 2016 30th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2016: 2786-2792. 10.1609/aaai.v34i10.7136 |
9 | NECULOIU P, VERSTEEGH M, ROTARU M. Learning text similarity with Siamese recurrent networks [C]// Proceedings of the 1st Workshop on Representation Learning for NLP. Stroudsburg: ACL, 2016: 148-157. 10.18653/v1/w16-1617 |
10 | BIAN W J, LI S, YANG Z, et al. A compare-aggregate model with dynamic-clip attention for answer selection [C]// Proceedings of the 2017 ACM Conference on Information and Knowledge Management. New York: ACM, 2017: 1987-1990. 10.1145/3132847.3133089 |
11 | SHA L, ZHANG X D, QIAN F, et al. A multi-view fusion neural network for answer selection [C]// Proceedings of the 2018 32nd AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 5422-5429. |
12 | YOON S, DERNONCOURT F, KIM D S, et al. A compare-aggregate model with latent clustering for answer selection [C]// Proceedings of the 2019 28th ACM International Conference on Information and Knowledge Management. New York: ACM, 2019: 2093-2096. 10.1145/3357384.3358148 |
13 | WANG S H, JIANG J. A compare-aggregate model for matching text sequences [EB/OL]. (2016-11-06) [2019-05-05]. 10.1109/ijcnn.2019.8852062 |
14 | TAN M, SANTOS C DOS, XIANG B, et al. Improved representation learning for question answer matching [C]// Proceedings of the 2016 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 464-473. 10.18653/v1/p16-1044 |
15 | SANTOS C DOS, TAN M, XIANG B, et al. Attentive pooling networks [EB/OL]. (2016-02-11) [2019-07-05]. |
16 | LASKAR M T R, HUANG J, HOQUE E. Contextualized embeddings based transformer encoder for sentence similarity modeling in answer selection task [C]// Proceedings of the 2020 12th Language Resources and Evaluation Conference. Paris: European Language Resources Association, 2020: 5505-5514. |
17 | PENNINGTON J, SOCHER R, MANNING C D. GloVe: global vectors for word representation [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2014: 1532-1543. 10.3115/v1/d14-1162 |
18 | WANG M Q, SMITH N A, MITAMURA T. What is the Jeopardy model? a quasi-synchronous grammar for QA [C]// Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Stroudsburg: ACL, 2007: 22-32. |
19 | YANG Y, YIH W T, MEEK C. WikiQA: a challenge dataset for open-domain question answering [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 2013-2018. 10.18653/v1/d15-1237 |
20 | RAO J F, HE H, LIN J. Noise-contrastive estimation for answer selection with deep neural networks [C]// Proceedings of the 2016 25th ACM International on Conference on Information and Knowledge Management. New York: ACM, 2016: 1913-1916. 10.1145/2983323.2983872 |
21 | TAY Y, TUAN L A, HUI S C. Multi-cast attention networks for retrieval-based question answering and response prediction [C]// Proceedings of the 2018 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2018: 2299-2308. 10.1145/3219819.3220048 |
22 | SEVERYN A, MOSCHITTI A. Modeling relational information in question-answer pairs with convolutional neural networks [EB/OL]. (2016-04-05) [2019-08-12]. 10.1145/2766462.2767738 |
23 | HE H, LIN J. Pairwise word interaction modeling with deep neural networks for semantic similarity measurement [C]// Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2016: 937-948. 10.18653/v1/n16-1108 |
24 | JIN Z X, ZHANG B W, ZHOU F, et al. Ranking via partial ordering for answer selection [J]. Information Sciences, 2020, 538: 358-371. 10.1016/j.ins.2020.05.110 |
25 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. |
26 | HOWARD J, RUDER S. Universal language model fine-tuning for text classification [C]// Proceedings of the 2018 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 328-339. 10.18653/v1/p18-1031 |