Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (6): 1647-1651. DOI: 10.11772/j.issn.1001-9081.2020091375

Special Issue: Artificial Intelligence

• Artificial Intelligence •

Application of Transformer optimized by pointer generator network and coverage loss in field of abstractive text summarization

LI Xiang, WANG Weibing, SHANG Xueda   

  1. School of Computer Science and Technology, Harbin University of Science and Technology, Harbin Heilongjiang 150080, China
  • Received: 2020-09-07  Revised: 2020-12-10  Online: 2021-06-10  Published: 2021-01-26
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61673142).

  • Corresponding author: WANG Weibing
  • About the authors: LI Xiang, born in 1994 in Suihua, Heilongjiang, is an M. S. candidate. His research interests include data mining and natural language processing. WANG Weibing, born in 1964 in Wuhan, Hubei, is a professor, Ph. D. and CCF member. His research interests include computer control and intelligent information processing. SHANG Xueda, born in 1995 in Xinxiang, Henan, is an M. S. candidate. His research interests include machine learning and natural language processing.

Abstract: For the application scenario of abstractive text summarization, a Transformer-based summarization model was proposed, in which a Pointer Generator network and Coverage Loss were added to the Transformer model for optimization. First, the Transformer model was taken as the basic structure, and its attention mechanism was used to better capture the semantic information of the context. Then, Coverage Loss was introduced into the loss function of the model to penalize the distribution and coverage of repeatedly generated words, so as to solve the problem that the attention mechanism in the Transformer model keeps generating the same word in abstractive tasks. Finally, the Pointer Generator network was added to the model, which allowed the model to copy words from the source text as generated words to solve the Out of Vocabulary (OOV) problem. Whether the improved model reduced inaccurate expressions, and whether the phenomenon of repeatedly generating the same word was eliminated, were then explored. Compared with the original Transformer model, the improved model improved the ROUGE-1 score by 1.98 percentage points, the ROUGE-2 score by 0.95 percentage points, and the ROUGE-L score by 2.27 percentage points, and it improved the readability and accuracy of the summarization results. Experimental results show that the Transformer can be applied to the field of abstractive text summarization after adding Coverage Loss and the Pointer Generator network.
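To make the two additions described above concrete, the following is a minimal, illustrative sketch (not the authors' code) of how a pointer-generator mixture distribution and a per-step coverage loss are commonly implemented on top of an attention-based decoder such as the Transformer. All tensor names, shapes, and function signatures here are assumptions for illustration only.

```python
# Hypothetical sketch of the pointer-generator copy distribution and the
# coverage loss; not taken from the paper's implementation.
import torch


def pointer_generator_dist(vocab_dist, attn_dist, p_gen, src_ids_ext, extra_vocab_size):
    """Mix the decoder's vocabulary distribution with the copy (attention)
    distribution so that words can be copied from the source text (handles OOV).

    vocab_dist:       (batch, vocab_size)  softmax over the fixed vocabulary
    attn_dist:        (batch, src_len)     attention over source positions
    p_gen:            (batch, 1)           generation probability in [0, 1]
    src_ids_ext:      (batch, src_len)     source token ids in the extended vocabulary
    extra_vocab_size: number of source-only (OOV) words appended to the vocabulary
    """
    batch, _ = vocab_dist.size()
    vocab_part = p_gen * vocab_dist          # probability mass for generating
    copy_part = (1.0 - p_gen) * attn_dist    # probability mass for copying
    # Extend the vocabulary distribution with zeros for source-only OOV words,
    # then scatter-add the copy probabilities onto the source token positions.
    extended = torch.cat(
        [vocab_part, vocab_part.new_zeros(batch, extra_vocab_size)], dim=1)
    return extended.scatter_add(1, src_ids_ext, copy_part)


def coverage_loss(attn_dist, coverage):
    """Coverage penalty for one decoding step: penalize attending again to
    source positions that were already covered, discouraging repetition.

    attn_dist: (batch, src_len) current attention distribution
    coverage:  (batch, src_len) sum of attention distributions of previous steps
    Returns the per-example step loss and the updated coverage vector.
    """
    step_loss = torch.sum(torch.min(attn_dist, coverage), dim=1)  # (batch,)
    new_coverage = coverage + attn_dist
    return step_loss, new_coverage
```

In such a setup, the coverage loss of each decoding step would typically be weighted by a hyperparameter and added to the negative log-likelihood term, while the mixed distribution replaces the plain softmax output when computing that likelihood.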

Key words: abstractive text summarization, attention mechanism, Transformer, coverage loss, pointer generator network

