
Answer Selection with Dynamic Attention and Multi-perspective Matching

LI Zhichao1, TUERDI Tuoheti1, AISIKAER Aimudula2

  1. School of Information Science and Engineering, Xinjiang University
  2. School of Information Science and Engineering, Xinjiang University, Urumqi 830046
  • Received: 2021-01-07 Revised: 2021-05-24 Online: 2021-05-24
  • Contact: TUERDI Tuoheti

Abstract: Current mainstream neural networks for answer selection cannot simultaneously provide a sufficient representation of each sentence and sufficient information interaction between sentences. To address this, a dynamic attention and multi-perspective matching method (DAMPM) is proposed. First, the ELMo pre-trained language model is used to obtain word vectors that carry simple semantic information, and a filtering mechanism in the attention layer removes noise from the sentences, yielding representations of the question and answer sentences. Second, several matching strategies are applied together in the matching layer to carry out the information interaction between the sentence vectors. Finally, a bidirectional long short-term memory network (BiLSTM) concatenates the sentence vectors produced by the matching layer, and the sentence similarity is computed. On the Text REtrieval Conference question answering (TRECQA) dataset, DAMPM improves on the Dynamic-Clip Attention Network (DCAN), the best compare-aggregate algorithm among the baseline models, by 1.6% in both mean average precision (MAP) and mean reciprocal rank (MRR). On the Wikipedia question answering (WIKIQA) dataset, DAMPM improves on DCAN by 0.7% and 0.3% on the two metrics, respectively. Overall, the experimental results show that DAMPM outperforms the baseline models.
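
The two reported metrics, MAP and MRR, can be illustrated with a minimal Python sketch. It is not the authors' evaluation script: the function names, the input layout (a dictionary from a question ID to `(score, label)` pairs), and the choice to skip questions that have no correct candidate are assumptions made for this illustration.

```python
from typing import Dict, List, Tuple


def average_precision(labels: List[int]) -> float:
    """Average precision for one question; labels are 0/1 flags in ranked order."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at each correct answer
    return total / hits if hits else 0.0


def reciprocal_rank(labels: List[int]) -> float:
    """1 / rank of the first correct answer, or 0.0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def map_mrr(scored: Dict[str, List[Tuple[float, int]]]) -> Tuple[float, float]:
    """Mean average precision and mean reciprocal rank over all questions."""
    aps, rrs = [], []
    for candidates in scored.values():
        # Rank candidate answers by model score, highest first.
        ranked = [label for _, label in
                  sorted(candidates, key=lambda c: c[0], reverse=True)]
        if not any(ranked):
            continue  # assumption: questions without a correct answer are skipped
        aps.append(average_precision(ranked))
        rrs.append(reciprocal_rank(ranked))
    n = len(aps)
    return (sum(aps) / n, sum(rrs) / n) if n else (0.0, 0.0)


# Hypothetical scores for two questions (not data from the paper).
example = {
    "q1": [(0.9, 0), (0.8, 1), (0.1, 1)],
    "q2": [(0.7, 1), (0.2, 0)],
}
mean_ap, mean_rr = map_mrr(example)
```

Both metrics are averaged per question, so a question with many candidate answers weighs the same as one with few.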

Key words: neural network, answer selection, dynamic attention mechanism, multi-perspective matching, pre-trained model
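
The abstract does not spell out the filtering mechanism of the attention layer, so the following Python sketch only shows one plausible form of attention filtering: weights below a threshold are zeroed and the rest renormalized. The cosine scoring, the threshold value, and all function names are assumptions for illustration, not the DAMPM definition.

```python
import math
from typing import List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two word vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def filter_weights(weights: List[float], threshold: float) -> List[float]:
    """Zero out weights below the threshold, then renormalize the survivors."""
    kept = [w if w >= threshold else 0.0 for w in weights]
    total = sum(kept)
    if total == 0.0:
        return list(weights)  # nothing survived: fall back to unfiltered weights
    return [w / total for w in kept]


def filtered_attention(question: List[Vector], answer: List[Vector],
                       threshold: float = 0.2) -> List[Vector]:
    """For each question word, a noise-filtered weighted sum of answer words."""
    dim = len(answer[0])
    attended = []
    for q in question:
        weights = softmax([cosine(q, a) for a in answer])
        weights = filter_weights(weights, threshold)
        attended.append([sum(w * a[d] for w, a in zip(weights, answer))
                         for d in range(dim)])
    return attended


# Hypothetical 2-dimensional word vectors (not ELMo output).
question_vecs = [[1.0, 0.0]]
answer_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
attended = filtered_attention(question_vecs, answer_vecs)
```

In this toy input the third answer word points away from the question word, receives a weight below the threshold, and is therefore excluded from the attended vector.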
