Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (11): 3156-3163. DOI: 10.11772/j.issn.1001-9081.2021010027

• Artificial Intelligence •

基于动态注意力和多角度匹配的答案选择模型

李志超, 吐尔地·托合提, 艾斯卡尔·艾木都拉

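  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046
  • Received: 2021-01-11 Revised: 2021-05-24 Accepted: 2021-05-25 Online: 2021-11-29 Published: 2021-11-10
  • Corresponding author: Tohti TURDI
  • About the authors: LI Zhichao (born in 1993), male, from Lianyuan, Hunan, M.S. candidate. Main research interests: question answering systems, natural language processing.
    TURDI Tohti (born in 1975), male, from Urumqi, Xinjiang, associate professor, Ph.D., CCF senior member. Main research interests: natural language processing, text mining, machine learning.
    ASKAR Hamdulla (born in 1972), male, from Urumqi, Xinjiang, professor, doctoral supervisor, Ph.D., CCF senior member. Main research interests: intelligent information processing, machine learning.
  • Funding:
    Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01C076)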

Answer selection model based on dynamic attention and multi-perspective matching

Zhichao LI, Tohti TURDI, Hamdulla ASKAR

  1. College of Information Science and Engineering, Xinjiang University, Urumqi Xinjiang 830046, China
  • Received: 2021-01-11 Revised: 2021-05-24 Accepted: 2021-05-25 Online: 2021-11-29 Published: 2021-11-10
  • Contact: Tohti TURDI
  • About the authors: LI Zhichao, born in 1993, M.S. candidate. His research interests include question answering systems and natural language processing.
    TURDI Tohti, born in 1975, Ph.D., associate professor. His research interests include natural language processing, text mining, and machine learning.
    ASKAR Hamdulla, born in 1972, Ph.D., professor. His research interests include intelligent information processing and machine learning.
  • Supported by:
    the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01C076)

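Abstract:

When handling the answer selection task, current mainstream neural networks cannot both represent sentences fully and let information interact fully between sentences. To address this problem, an answer selection model based on Dynamic Attention and Multi-Perspective Matching (DAMPM) is proposed. First, pre-trained Embeddings from Language Models (ELMo) are used to obtain word vectors that contain simple semantic information. Next, a filtering mechanism in the attention layer removes noise from the sentences, which yields better representations of the question and answer sentences. After that, several matching strategies are introduced together in the matching layer to carry out the information interaction between sentence vectors. Then, a Bidirectional Long Short-Term Memory (BiLSTM) network is used to concatenate the sentence vectors output by the matching layer. Finally, a classifier computes the similarity from the concatenated vector, which gives the semantic relatedness between the question and the answer sentence. Experimental results on the Text REtrieval Conference Question Answering (TRECQA) dataset show that, compared with the Dynamic-Clip Attention Network (DCAN), one of the baseline models built on the compare-aggregate framework, DAMPM improves both Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR) by 1.6 percentage points. Experimental results on the Wiki Question Answering (WikiQA) dataset show that DAMPM improves on DCAN by 0.7 and 0.8 percentage points on the two metrics, respectively. Overall, the proposed DAMPM performs better than the methods among the baseline models.

Keywords: neural network, answer selection, dynamic attention mechanism, multi-perspective matching, pre-trained language model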

Abstract:

Current mainstream neural networks for answer selection cannot achieve full sentence representation and full information interaction between sentences at the same time. To address this problem, an answer selection model based on Dynamic Attention and Multi-Perspective Matching (DAMPM) was proposed. Firstly, pre-trained Embeddings from Language Models (ELMo) were used to obtain word vectors containing simple semantic information. Secondly, a filtering mechanism was used in the attention layer to remove noise from the sentences, so that better representations of the question and answer sentences were obtained. Thirdly, multiple matching strategies were introduced together in the matching layer to complete the information interaction between sentence vectors. Then, the sentence vectors output from the matching layer were concatenated by a Bidirectional Long Short-Term Memory (BiLSTM) network. Finally, a classifier calculated the similarity from the concatenated vector, giving the semantic correlation between the question and answer sentences. Experimental results on the Text REtrieval Conference Question Answering (TRECQA) dataset show that, compared with the Dynamic-Clip Attention Network (DCAN), one of the baseline models based on the compare-aggregate framework, the proposed DAMPM improves both Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR) by 1.6 percentage points. Experimental results on the Wiki Question Answering (WikiQA) dataset show that the two performance indices of DAMPM are 0.7 and 0.8 percentage points higher than those of DCAN, respectively. Overall, the proposed DAMPM performs better than the methods among the baseline models.
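The abstract names two mechanisms without giving formulas: a filter in the attention layer and several matching strategies in the matching layer. The following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes the filter keeps the k largest attention weights per question word, and that one matching strategy is a weighted cosine similarity over several "perspectives". The function names, the parameter `k`, and the weight matrix `W` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_filtered_attention(q, a, k):
    """Attend from each question word to the answer, keeping only the top-k weights.

    q: (m, d) question word vectors; a: (n, d) answer word vectors; 1 <= k <= n.
    Returns an (m, d) array: one filtered summary of the answer per question word.
    Assumption: "filtering" means zeroing all but the k largest attention weights.
    """
    weights = softmax(q @ a.T, axis=1)                 # (m, n) attention weights
    kth = np.sort(weights, axis=1)[:, -k][:, None]     # k-th largest weight per row
    kept = np.where(weights >= kth, weights, 0.0)      # ties may keep more than k
    kept = kept / kept.sum(axis=1, keepdims=True)      # renormalise surviving weights
    return kept @ a

def multi_perspective_cosine(u, v, W):
    """Cosine similarity of u and v under p element-wise reweightings.

    u, v: (d,) vectors; W: (p, d), one weight vector per perspective (hypothetical).
    Returns a (p,) vector of similarities.
    """
    wu, wv = W * u, W * v
    num = (wu * wv).sum(axis=1)
    den = np.linalg.norm(wu, axis=1) * np.linalg.norm(wv, axis=1) + 1e-8
    return num / den
```

With `k` equal to the answer length the filter is a no-op and the function reduces to ordinary soft attention; with `k = 1` each question word simply selects its best-matching answer word.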

Key words: neural network, answer selection, dynamic attention mechanism, multi-perspective matching, pre-trained language model
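The two metrics reported in the abstract, MAP and MRR, can be illustrated with a short Python sketch. The ranked lists below are made-up toy data, not results from the paper, and the helper names are hypothetical. Each list holds 0/1 relevance flags for one question's candidate answers, ordered by descending model score.

```python
def average_precision(labels):
    """Mean of precision@rank taken at each rank that holds a correct answer."""
    hits = 0
    precisions = []
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    # Sketch-level choice: a question with no correct candidate scores 0.
    return sum(precisions) / len(precisions) if precisions else 0.0

def reciprocal_rank(labels):
    """1 / rank of the first correct answer, or 0 if there is none."""
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            return 1.0 / rank
    return 0.0

def mean_metric(metric, ranked_lists):
    """Average a per-question metric over all questions."""
    return sum(metric(labels) for labels in ranked_lists) / len(ranked_lists)

# Toy example: two questions with 4 and 3 candidate answers.
ranked = [[0, 1, 0, 1], [1, 0, 0]]
map_score = mean_metric(average_precision, ranked)
mrr_score = mean_metric(reciprocal_rank, ranked)
print(f"MAP={map_score:.3f} MRR={mrr_score:.3f}")  # prints MAP=0.750 MRR=0.750
```

Scoring a question with no correct candidate as 0 is a choice made for this sketch; evaluation scripts often drop such questions before averaging, which changes the reported numbers.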

CLC Number: