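WISA2023+10 Boundary-aware approach to machine reading comprehension

LIU Qing (刘青)1, CHEN Yanping (陈艳平)2, ZOU Anqi (邹安琪)1, HUANG Ruizhang (黄瑞章)1, QIN Yongbin (秦永彬)1

  1. Guizhou University
  2. College of Computer Science and Technology, Guizhou University
  • Received: 2023-09-01 Revised: 2023-09-06 Online: 2023-12-18
  • Corresponding author: CHEN Yanping (陈艳平)
  • Supported by:
    National Natural Science Foundation of China; Guizhou Provincial Science and Technology Support Program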

Abstract: In span-extraction machine reading comprehension, methods that fine-tune pre-trained language models perform well. However, such methods typically take the concatenated question and context as input and obtain the answer directly by predicting its start and end boundaries, which leaves the predicted boundaries insufficiently accurate. To alleviate this problem, this paper proposes a boundary-aware approach. The method emphasises the question boundaries and incorporates the perceived semantic information of the question boundaries into an answer boundary regressor, which further adjusts predicted boundaries that deviate from the true ones. This strengthens the model's perception of question boundary information and calibrates the predicted answer boundaries. Experiments on the public datasets SQuAD1.1, HotpotQA, and NewsQA confirm the effectiveness of the proposed method, and ablation experiments verify the necessity of each of its modules.

Key words: machine reading comprehension, question boundary awareness, answer boundary regression, span extraction, SQuAD1.1
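
For readers unfamiliar with the baseline the abstract refers to, the sketch below illustrates the conventional way an answer span is read off from per-token start and end scores. It is only an illustration of that standard decoding step, not the boundary-aware regressor proposed in the paper; the function name `best_span`, the `max_len` limit, and the toy scores are all assumptions made for this example.

```python
from typing import List, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_len: int = 30,
) -> Tuple[int, int]:
    """Pick the (start, end) token indices with the highest combined score.

    Hypothetical helper: it searches all spans with start <= end and at most
    max_len tokens, scoring each span as start_scores[s] + end_scores[e].
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start, within max_len tokens.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Toy example with made-up scores (not taken from any real model).
tokens = ["The", "capital", "of", "France", "is", "Paris", "."]
start = [0.1, 0.2, 0.0, 0.3, 0.1, 2.5, 0.0]
end = [0.0, 0.1, 0.0, 0.4, 0.2, 2.8, 0.3]

s, e = best_span(start, end)
answer = " ".join(tokens[s : e + 1])
print(answer)
```

Because the start and end positions are scored independently, a small error in either score shifts the extracted span by one or more tokens, which is the kind of boundary inaccuracy the abstract says the proposed method is meant to correct.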

CLC Number: