Answer Selection with Dynamic Attention and Multi-perspective Matching

李志超¹, 吐尔地·托合提¹, 艾斯卡尔·艾木都拉²

1. College of Information Science and Engineering, Xinjiang University
2. College of Information Science and Engineering, Xinjiang University, Urumqi 830046

Received: 2021-01-07 Revised: 2021-05-24 Online: 2021-05-24
Corresponding author: TUERDI Tuoheti (吐尔地·托合提)

Abstract

Current mainstream neural networks for the answer selection task cannot simultaneously provide a sufficient representation of each sentence and sufficient information interaction between sentences. To address this problem, a Dynamic Attention and Multi-Perspective Matching method (DAMPM) is proposed. First, the Embeddings from Language Models (ELMo) pre-trained model is used to obtain word vectors containing simple semantic information. Then, a filtering mechanism in the attention layer removes noise from the sentences, yielding the representations of the question and answer sentences. Next, multiple matching strategies are introduced simultaneously in the matching layer to carry out the information interaction between sentence vectors. Finally, the sentence vectors obtained from the matching layer are concatenated using a Bidirectional Long Short-Term Memory (BiLSTM) network, and the sentence similarity is computed. Experimental results on the Text REtrieval Conference Question Answering (TRECQA) dataset show that, compared with the Dynamic-Clip Attention Network (DCAN), the best-performing baseline within the compare-aggregate framework, DAMPM improves both Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR) by 1.6%. On the Wiki Question Answering (WIKIQA) dataset, DAMPM improves on DCAN by 0.7% and 0.3% on the two metrics, respectively. Overall, the experimental results show that DAMPM performs better than the baseline models.

Key words: neural network, answer selection, dynamic attention mechanism, multi-perspective matching, pre-trained model
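
The two metrics reported in the abstract, MAP and MRR, can be illustrated with a minimal sketch. This is not the authors' evaluation script: the function names, the input layout (a dictionary mapping a question ID to `(score, label)` pairs), and the choice to skip questions that have no correct candidate are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def average_precision(labels: List[int]) -> float:
    """Average precision for one question.

    `labels` are the candidate labels in ranked order (1 = correct answer).
    """
    hits = 0
    precisions = []
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)  # precision at this correct answer
    return sum(precisions) / len(precisions) if precisions else 0.0


def reciprocal_rank(labels: List[int]) -> float:
    """Reciprocal of the rank of the first correct answer (0.0 if none)."""
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0


def map_mrr(scored: Dict[str, List[Tuple[float, int]]]) -> Tuple[float, float]:
    """Return (MAP, MRR) over all questions.

    `scored` maps a hypothetical question ID to (model_score, label) pairs.
    Assumption: questions without any correct candidate are skipped.
    """
    aps, rrs = [], []
    for candidates in scored.values():
        ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
        labels = [label for _, label in ranked]
        if 1 not in labels:
            continue
        aps.append(average_precision(labels))
        rrs.append(reciprocal_rank(labels))
    n = len(aps)
    return (sum(aps) / n, sum(rrs) / n) if n else (0.0, 0.0)


if __name__ == "__main__":
    toy = {
        "q1": [(0.9, 0), (0.8, 1), (0.1, 1)],
        "q2": [(0.7, 1), (0.2, 0)],
    }
    print(map_mrr(toy))
```

In this toy input, the first correct answer for `q1` is ranked second and the one for `q2` is ranked first, so MRR averages 1/2 and 1, while MAP additionally rewards the second correct answer of `q1` appearing at rank three.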

CLC Number: TP391

李志超, 吐尔地·托合提, 艾斯卡尔·艾木都拉. Answer Selection with Dynamic Attention and Multi-perspective Matching[J]. Journal of Computer Applications.
