Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (1): 64-70. DOI: 10.11772/j.issn.1001-9081.2021020335

• Artificial intelligence •

Three-stage question answering model based on BERT

Yu PENG, Xiaoyu LI, Shijie HU, Xiaolei LIU, Weizhong QIAN

  1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China
  • Received: 2021-03-08 Revised: 2021-05-12 Accepted: 2021-05-17 Online: 2021-05-24 Published: 2022-01-10
  • Contact: Xiaoyu LI
  • About author: PENG Yu, born in 1996, M. S. candidate. His research interests include deep learning, natural language processing.
    LI Xiaoyu, born in 1985, Ph. D., associate professor, CCF member. Her research interests include machine learning, data analysis, quantum machine learning.
    HU Shijie, born in 1998, M. S. candidate. His research interests include deep learning, natural language processing.
    LIU Xiaolei, born in 1996, M. S. candidate. His research interests include deep learning, generative adversarial network, computer vision.
    QIAN Weizhong, born in 1976, Ph. D., associate professor. His research interests include quantum machine learning, blockchain.
  • Supported by:
    Science and Technology Project of Sichuan Province (Key Research and Development Program) (19ZDYF0794)


Abstract:

The development of pre-trained language models has greatly advanced machine reading comprehension. To make full use of the shallow features of a pre-trained language model and further improve the accuracy of the answers predicted by a question answering model, a three-stage question answering model based on Bidirectional Encoder Representation from Transformers (BERT) was proposed. Firstly, three stages, namely pre-answering, re-answering and answer-adjusting, were designed on the basis of BERT. Secondly, in the pre-answering stage, the inputs of BERT's embedding layer were treated as shallow features to pre-generate an answer. Then, in the re-answering stage, the deep features fully encoded by BERT were used to regenerate another answer. Finally, in the answer-adjusting stage, the two answers were combined to produce the final prediction. Experimental results on the span-extraction question answering datasets Stanford Question Answering Dataset 2.0 (SQuAD2.0) in English and Chinese Machine Reading Comprehension 2018 (CMRC2018) in Chinese show that the proposed model improves the Exact Match (EM) and F1 score by an average of 1 to 3 percentage points over comparable baseline models and extracts more accurate answer spans. By fusing the shallow features of BERT with its deep features, the three-stage model extends the abstract representation ability of BERT and explores the application of its shallow features in question answering models, while keeping the structure simple, the predictions accurate, and training and inference fast.
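The abstract specifies the three stages but no implementation details. Below is a minimal sketch of the described architecture in PyTorch with Hugging Face Transformers; reading the "shallow features" as the output of BERT's embedding layer, using plain linear start/end heads, and adjusting the answer with a fixed weighted sum (the alpha parameter) are assumptions made for illustration, not the paper's exact formulation.

import torch.nn as nn
from transformers import BertModel

class ThreeStageQA(nn.Module):
    """Three-stage span-extraction QA: pre-answer on shallow features,
    re-answer on deep features, then adjust by combining the two."""

    def __init__(self, model_name="bert-base-chinese", alpha=0.3):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        # Stage 1 head: start/end logits from shallow features.
        self.pre_head = nn.Linear(hidden, 2)
        # Stage 2 head: start/end logits from deep features.
        self.re_head = nn.Linear(hidden, 2)
        # Stage 3: interpolation weight between the two answers (assumed value).
        self.alpha = alpha

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        # Pre-answering: shallow features taken as the embedding-layer output,
        # i.e. the representation before any Transformer encoding (assumption).
        shallow = self.bert.embeddings(input_ids=input_ids,
                                       token_type_ids=token_type_ids)
        pre_start, pre_end = self.pre_head(shallow).split(1, dim=-1)

        # Re-answering: deep features are the fully encoded sequence output.
        deep = self.bert(input_ids=input_ids,
                         attention_mask=attention_mask,
                         token_type_ids=token_type_ids).last_hidden_state
        re_start, re_end = self.re_head(deep).split(1, dim=-1)

        # Answer-adjusting: blend the pre-answer and re-answer logits.
        start_logits = self.alpha * pre_start + (1 - self.alpha) * re_start
        end_logits = self.alpha * pre_end + (1 - self.alpha) * re_end
        return start_logits.squeeze(-1), end_logits.squeeze(-1)

if __name__ == "__main__":
    from transformers import BertTokenizerFast
    tok = BertTokenizerFast.from_pretrained("bert-base-chinese")
    model = ThreeStageQA()
    enc = tok("What model is proposed?",
              "A three-stage question answering model based on BERT is proposed.",
              return_tensors="pt")
    start_logits, end_logits = model(**enc)
    print(start_logits.shape, end_logits.shape)  # torch.Size([1, seq_len]) each

During training, the combined start and end logits can be optimized with the usual cross-entropy loss against the gold answer span, as in standard BERT span-extraction fine-tuning.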

Key words: Natural Language Processing (NLP), machine reading comprehension, span-extraction question answering, Bidirectional Encoder Representation from Transformers (BERT), deep learning

