Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 1979-1984. DOI: 10.11772/j.issn.1001-9081.2021050719

• Artificial Intelligence •

Machine reading comprehension model based on event representation

WANG Yuanlong, LIU Xiaomin, ZHANG Hu

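  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
  • Received: 2021-05-07 Revised: 2022-02-21 Accepted: 2022-02-25 Online: 2022-03-15 Published: 2022-07-10
  • Corresponding author: WANG Yuanlong
  • About the authors: WANG Yuanlong (born 1983), male, from Datong, Shanxi; associate professor, Ph.D., CCF member. Research interests: natural language processing, machine learning.
    LIU Xiaomin (born 2000), female, from Shuozhou, Shanxi; M.S. candidate. Research interests: natural language processing.
    ZHANG Hu (born 1979), male, from Datong, Shanxi; associate professor, Ph.D., CCF member. Research interests: natural language processing.
  • Supported by:
    National Natural Science Foundation of China (61806117)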

Machine reading comprehension model based on event representation

Yuanlong WANG, Xiaomin LIU, Hu ZHANG

  1. School of Computer and Information Technology, Shanxi University, Taiyuan Shanxi 030006, China
  • Received: 2021-05-07 Revised: 2022-02-21 Accepted: 2022-02-25 Online: 2022-03-15 Published: 2022-07-10
  • Contact: Yuanlong WANG
  • About the authors: WANG Yuanlong, born in 1983, Ph.D., associate professor. His research interests include natural language processing and machine learning.
    LIU Xiaomin, born in 2000, M.S. candidate. Her research interests include natural language processing.
    ZHANG Hu, born in 1979, Ph.D., associate professor. His research interests include natural language processing.
  • Supported by:
    National Natural Science Foundation of China (61806117)

Abstract:

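Truly understanding a passage requires grasping the main thread of the original text during reading comprehension. To address main-clue questions in machine reading comprehension, a machine reading comprehension analysis method based on event representation was proposed. First, a textual event graph was extracted from the reading material by means of clue phrases, covering event representation, event element extraction, and event relation extraction. Then, taking into account the time elements and emotional elements of events as well as the importance of each word in the document, the TextRank algorithm was used to select the clue-related events. Finally, the answer to the question was constructed from the selected clue events. On a test set of 339 collected clue questions, experimental results show that the proposed method improves on the TextRank-based sentence ranking method in both BLEU and CIDEr: BLEU-4 increased by 4.1 percentage points and CIDEr by 9 percentage points.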

Keywords: natural language processing, reading comprehension, main-clue question, event representation, textual event graph

Abstract:

Truly understanding a piece of text requires grasping the main clues of the original text during reading comprehension. To address main-clue questions in machine reading comprehension, a machine reading comprehension method based on event representation was proposed. Firstly, a textual event graph, covering event representation, event element extraction, and event relation extraction, was extracted from the reading material by means of clue phrases. Secondly, taking into account the time elements and emotional elements of events and the importance of each word in the document, the TextRank algorithm was used to select the events related to the clues. Finally, the answers to the questions were constructed from the selected clue events. Experimental results on a test set of 339 collected clue questions show that the proposed method outperforms the sentence ranking method based on the TextRank algorithm on the BiLingual Evaluation Understudy (BLEU) and Consensus-based Image Description Evaluation (CIDEr) metrics. Specifically, BLEU-4 increased by 4.1 percentage points and CIDEr increased by 9 percentage points.

Key words: natural language processing, reading comprehension, question of main clues, event representation, textual event graph
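
The event-selection step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how time, emotion, and word-importance cues enter the ranking, so the sketch assumes they are folded into a single per-event prior that biases a weighted TextRank (a personalized PageRank over an undirected event graph). The function name `textrank`, the event labels, the edge weights, and the prior values are all hypothetical.

```python
# Hypothetical sketch: weighted TextRank over a small event graph.
# Event names, edge weights, and prior scores are invented for illustration
# and do not come from the paper.

def textrank(edges, prior, damping=0.85, iterations=50):
    """Rank the nodes of an undirected weighted graph.

    edges: dict mapping (u, v) node pairs to a positive similarity weight.
    prior: dict mapping every node to a non-negative prior importance
           (assumed here to combine time, emotion, and word-importance
           cues); it is normalized internally to sum to 1.
    Returns a dict mapping each node to its score.
    """
    nodes = sorted(prior)
    # Build a symmetric adjacency map from the edge list.
    adj = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        adj[u][v] = w
        adj[v][u] = w
    total_prior = sum(prior.values())
    base = {n: prior[n] / total_prior for n in nodes}
    # Total edge weight per node, used to normalize outgoing votes.
    out_weight = {n: sum(adj[n].values()) for n in nodes}
    score = dict(base)
    for _ in range(iterations):
        new_score = {}
        for n in nodes:
            # Each neighbor m passes on a share of its score proportional
            # to the weight of the edge (m, n).
            incoming = sum(
                score[m] * w / out_weight[m] for m, w in adj[n].items()
            )
            new_score[n] = (1 - damping) * base[n] + damping * incoming
        score = new_score
    return score


# Toy event graph: four events on a chain, with e2 given the largest prior.
toy_prior = {"e1": 1.0, "e2": 2.0, "e3": 1.0, "e4": 0.5}
toy_edges = {("e1", "e2"): 0.6, ("e2", "e3"): 0.8, ("e3", "e4"): 0.2}
toy_scores = textrank(toy_edges, toy_prior)
clue_events = sorted(toy_scores, key=toy_scores.get, reverse=True)[:2]
print(clue_events)
```

Using the prior as the teleport distribution is one simple way to let event-level cues influence the ranking; multiplying the cues into the edge weights would be an equally plausible reading of the abstract.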

CLC Number: