Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (3): 699-705. DOI: 10.11772/j.issn.1001-9081.2020060837

Special Topic: Artificial Intelligence

• Artificial Intelligence •

Reinforced automatic summarization model based on advantage actor-critic algorithm

DU Xixi, CHENG Hua, FANG Yiquan

  1. Institute of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2020-06-17  Revised: 2020-10-08  Online: 2021-03-10  Published: 2021-01-15
  • Corresponding author: CHENG Hua
  • About the authors: DU Xixi, born in 1994 in Bozhou, Anhui, M. S. candidate, her research interests include natural language processing; CHENG Hua, born in 1975 in Huangshan, Anhui, Ph. D., associate research fellow, his research interests include intelligent signal processing and information security; FANG Yiquan, born in 1975 in Shanghai, M. S., senior experimentalist, her research interests include information security.
  • Supported by:
    This work is partially supported by the CERNET Innovation Project (NGII20170520).

Abstract: In long-text automatic summarization, extractive summary models tend to produce redundant summaries, while abstractive summary models often lose key information, generate inaccurate summaries and repeat content. To address these problems, a Reinforced Automatic Summarization model based on the Advantage Actor-Critic algorithm (A2C-RLAS) was proposed for long text. Firstly, the key sentences of the original text were extracted by an extractor built on a hybrid neural network of Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN). Then, the key sentences were refined by a rewriter based on the copy mechanism and the attention mechanism. Finally, the Advantage Actor-Critic (A2C) algorithm of reinforcement learning was used to train the entire network, with the semantic similarity between the rewritten summary and the reference summary, measured by the BERTScore (Evaluating Text Generation with Bidirectional Encoder Representations from Transformers) value, used as the reward to guide the extraction process, thereby improving the quality of the sentences selected by the extractor. Experimental results on the CNN/Daily Mail dataset show that, compared with models such as the Reinforcement Learning-based extractive summarization model (Refresh), the Recurrent Neural Network based sequence model for extractive summarization (SummaRuNNer) and the Distributional Semantics Reward (DSR) model, A2C-RLAS produces final summaries with more accurate content, more fluent language and markedly less redundancy, and improves both the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) and BERTScore metrics. Compared with the Refresh model and the SummaRuNNer model, the A2C-RLAS model increases the ROUGE-L value by 6.3% and 10.2% respectively; compared with the DSR model, the A2C-RLAS model increases the F1 value by 30.5%.
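To make the training signal concrete, the following is a minimal, hypothetical sketch in PyTorch (not the authors' code) of a single A2C update in which a BERTScore-based reward guides sentence extraction. The extractor and rewriter modules, the function a2c_step and all parameter names are illustrative assumptions; only the general A2C structure and the use of BERTScore as the reward come from the abstract.

# Minimal A2C update sketch with a BERTScore reward (PyTorch).
# `extractor` (actor + critic over document sentences) and `rewriter`
# are hypothetical stand-ins for the paper's CNN/RNN extractor and
# copy/attention rewriter; this is not the authors' implementation.
import torch
from torch.distributions import Categorical
from bert_score import score as bert_score  # pip install bert-score

def a2c_step(extractor, rewriter, optimizer, doc_sentences, reference_summary):
    # Actor head: probability of selecting each sentence; critic head: state value.
    sent_probs, state_value = extractor(doc_sentences)       # sent_probs: (num_sentences,)
    dist = Categorical(probs=sent_probs)
    action = dist.sample()                                    # index of the chosen sentence
    rewritten = rewriter(doc_sentences[action.item()])        # refine the chosen sentence

    # Reward: semantic similarity (BERTScore F1) between the rewritten
    # sentence and the reference summary.
    with torch.no_grad():
        _, _, f1 = bert_score([rewritten], [reference_summary], lang="en")
    reward = f1.item()

    # Advantage = observed reward minus the critic's baseline estimate.
    advantage = reward - state_value
    policy_loss = -dist.log_prob(action) * advantage.detach()  # actor term
    value_loss = advantage.pow(2)                               # critic term (squared error)
    loss = (policy_loss + value_loss).sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward

The critic's value estimate acts as a baseline that reduces the variance of the policy-gradient update, which is precisely the role the advantage plays in A2C.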

Key words: automatic summarization model, extractive summarization model, abstractive summarization model, encoder-decoder, reinforcement learning, Advantage Actor-Critic (A2C) algorithm

CLC Number: