Journal of Computer Applications. DOI: 10.11772/j.issn.1001-9081.2024060893
FANG Yuhan1,2, YANG Fan3, ZHANG Qing4
Abstract: To address the problems of extraction position bias, answer redundancy, and insufficient sample data for pre-trained language models in extractive machine reading comprehension tasks, a machine reading comprehension model integrating dynamic interaction and contrastive learning was proposed. Firstly, the decoding layer of the pre-trained model was replaced with an interactive prediction layer, and dynamic self-attention and dynamic query mechanisms were introduced for answer prediction. Secondly, key positions were selected from the semantic vectors output by the pre-trained model using a TopK algorithm, and the features at these positions were enhanced with a multi-head self-attention mechanism. Then, a dynamic query vector was computed from the enhanced semantic vectors and the static query vector, and the answer prediction vector was output. Finally, in the loss computation stage, negative samples were constructed for contrastive learning, and a triplet loss was introduced to avoid overfitting. Experimental results show that on the CMRC2018 (Chinese Machine Reading Comprehension 2018) dataset, compared with the baseline model RoBERTa-wwm-ext-large (Robustly optimized BERT approach with Whole Word Masking, extended, large), the proposed method improves the F1 and EM (Exact Match) values by 1.82 and 1.29 percentage points, respectively; on the English SQuADv1.1 (Stanford Question Answering Dataset version 1.1) dataset, compared with the baseline model RoBERTa (Robustly optimized BERT approach), it improves the F1 and EM values by 1.18 and 0.58 percentage points, respectively, outperforming most existing machine reading comprehension models. These results verify the effectiveness and generalization of the proposed algorithm, which can accomplish more accurate and reliable reading comprehension tasks.
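The TopK key-position selection and multi-head feature enhancement described in the abstract can be sketched as follows. This is a minimal illustration based only on the abstract, not the authors' implementation: the function name `select_and_enhance`, the importance-score vector, the residual write-back, and the use of identity per-head projections are all simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def select_and_enhance(H, scores, k, num_heads=2):
    """Select the top-k key positions from the encoder output H
    (seq_len x d) by importance score, then enhance the features at
    those positions with multi-head self-attention over the selection."""
    idx = np.argsort(scores)[-k:]            # indices of the k highest-scoring positions
    K = H[idx]                               # (k, d) key-position features
    d_head = H.shape[1] // num_heads
    heads = []
    for h in range(num_heads):
        Q = K[:, h * d_head:(h + 1) * d_head]        # per-head slice (identity projections for brevity)
        attn = softmax(Q @ Q.T / np.sqrt(d_head))    # scaled dot-product attention weights
        heads.append(attn @ Q)
    enhanced = np.concatenate(heads, axis=-1)        # (k, d) re-assembled heads
    out = H.copy()
    out[idx] = out[idx] + enhanced           # residual update at the key positions only
    return out, idx
```

Only the selected positions are modified; the rest of the semantic vectors pass through unchanged, which matches the abstract's description of enhancing the features of the key positions.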
Key words: Machine Reading Comprehension, Pre-trained Models, Span Extraction, Dynamic Interaction, Triplet Loss
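The triplet loss over constructed negative samples mentioned in the abstract can be illustrated with a minimal sketch. The Euclidean distance and the `margin` default are assumptions for illustration; the abstract does not give the paper's exact formulation.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: pull the anchor representation toward the positive
    (correct-answer) representation and push it away from a constructed
    negative sample by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)   # distance to the positive sample
    d_neg = np.linalg.norm(anchor - negative)   # distance to the negative sample
    return max(d_pos - d_neg + margin, 0.0)     # hinge: zero once the gap exceeds margin
```

The hinge form means the loss vanishes once the negative is at least `margin` farther from the anchor than the positive, which limits over-optimization on already-separated samples.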
CLC Number: TP391.1
FANG Yuhan, YANG Fan, ZHANG Qing. Machine reading comprehension model integrating dynamic interaction and contrastive learning[J]. Journal of Computer Applications, 0, (): 0-0.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024060893