Reinforced automatic summarization model based on advantage actor-critic algorithm

doi:10.11772/j.issn.1001-9081.2020060837

Abstract

In long-text automatic summarization, extractive models tend to produce redundant summaries, while abstractive models often lose key information, summarize inaccurately, and repeat generated content. To address these problems, a Reinforced Automatic Summarization model based on the Advantage Actor-Critic algorithm (A2C-RLAS) was proposed for long text. First, an extractor built on a hybrid of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) selected the key sentences of the original text. Then, a rewriter based on the copy mechanism and the attention mechanism refined those key sentences. Finally, the Advantage Actor-Critic (A2C) algorithm from reinforcement learning was used to train the entire network, with the semantic similarity between the rewritten summary and the reference summary, measured by BERTScore (evaluating text generation with Bidirectional Encoder Representations from Transformers), serving as the reward that guides extraction and thereby improves the quality of the sentences chosen by the extractor. Experimental results on the CNN/Daily Mail dataset show that, compared with models such as the Reinforcement Learning-based Extractive Summarization (Refresh) model, the Recurrent Neural Network based sequence model for extractive summarization (SummaRuNNer), and the Distributional Semantics Reward (DSR) model, A2C-RLAS produces final summaries that are more accurate and more fluent, with redundant content effectively reduced, and improves both the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and BERTScore metrics. The ROUGE-L value of A2C-RLAS is 6.3% and 10.2% higher than those of the Refresh and SummaRuNNer models respectively, and its F1 value is 30.5% higher than that of the DSR model.

Key words: automatic summary model, extractive summary model, abstractive summary model, encoder-decoder, reinforcement learning, Advantage Actor-Critic (A2C) algorithm
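
The training step summarized in the abstract (a reward guides the extractor through the advantage actor-critic algorithm) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `discounted_returns` and `a2c_losses`, the discount factor `gamma`, and the per-step reward list are assumptions made for illustration. In the paper's setting the reward would be a BERTScore value computed between the rewritten and reference summaries; here plain floats stand in for it.

```python
from typing import List, Tuple


def discounted_returns(rewards: List[float], gamma: float = 0.95) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every extraction step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def a2c_losses(
    log_probs: List[float],  # log pi(a_t | s_t) of each extracted sentence (actor)
    values: List[float],     # V(s_t) predicted by the critic
    rewards: List[float],    # placeholder rewards (BERTScore in the paper's setting)
    gamma: float = 0.95,     # hypothetical discount factor, not taken from the paper
) -> Tuple[float, float]:
    """Compute the actor and critic losses for one sampled extraction episode."""
    if not (len(log_probs) == len(values) == len(rewards)):
        raise ValueError("log_probs, values and rewards must have the same length")
    if not rewards:
        raise ValueError("the episode must contain at least one step")

    returns = discounted_returns(rewards, gamma)
    # Advantage: how much better the observed return was than the critic expected.
    advantages = [g - v for g, v in zip(returns, values)]
    # Actor: raise the log-probability of actions with positive advantage.
    actor_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages))
    # Critic: mean squared error between predicted value and observed return.
    critic_loss = sum(adv * adv for adv in advantages) / len(advantages)
    return actor_loss, critic_loss
```

Subtracting the critic's value estimate from the return gives the advantage, which lowers the variance of the policy gradient compared with using the raw reward alone. In a real training loop the advantage would be treated as a constant when differentiating the actor loss, and both losses would be minimized with an automatic differentiation framework.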

CLC Number: TP391.1

DU Xixi, CHENG Hua, FANG Yiquan. Reinforced automatic summarization model based on advantage actor-critic algorithm[J]. Journal of Computer Applications, 2021, 41(3): 699-705.

杜嘻嘻, 程华, 房一泉. 基于优势演员-评论家算法的强化自动摘要模型[J]. 计算机应用, 2021, 41(3): 699-705.
