Scientific paper summarization model using macro discourse structure

doi:10.11772/j.issn.1001-9081.2020121945

Abstract

Abstract: Traditional neural network models cannot capture the macro discourse structure between the sections of a scientific paper, so the summaries they generate tend to be structurally incomplete and incoherent. To address this problem, a scientific paper summarization model using macro discourse structure was proposed. Firstly, a hierarchical encoder based on macro discourse structure was built, in which a graph convolution neural network encodes the macro discourse structure information between sections to construct a hierarchical semantic representation of the sections. Then, an information fusion module was proposed to fuse macro discourse structure information with word-level information and thereby assist the decoder in generating the summary. Finally, an attention mechanism optimization unit was used to update and optimize the context vector. Experimental results show that the proposed model outperforms the baseline model by 3.53, 1.15 and 4.29 percentage points on ROUGE (Recall-Oriented Understudy for Gisting Evaluation)-1, ROUGE-2 and ROUGE-L respectively. Analysis and comparison of the generated summaries further indicate that the proposed model effectively improves summary quality.
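To make the section-level encoding step more concrete, the following is a minimal sketch of a single graph convolution layer in the commonly used form ReLU(D^-1/2 (A + I) D^-1/2 H W). It is an illustration under stated assumptions, not the authors' implementation: the function name `gcn_layer`, the chain-shaped adjacency between three sections, and the toy feature and weight matrices are all hypothetical, since the abstract does not specify how the section graph is built.

```python
import math

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so each section keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric normalization by node degree (self-loop included).
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate features from neighboring sections.
    dim_in = len(feats[0])
    agg = [[sum(norm[i][k] * feats[k][d] for k in range(n))
            for d in range(dim_in)] for i in range(n)]
    # Apply the linear map followed by ReLU.
    dim_out = len(weight[0])
    return [[max(0.0, sum(agg[i][d] * weight[d][o] for d in range(dim_in)))
             for o in range(dim_out)] for i in range(n)]

# Hypothetical example: three sections linked in reading order (a chain).
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
# Toy section features and weights (identity matrices, assumptions only).
feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
section_repr = gcn_layer(adj, feats, weight)
```

With identity features and weights, each row of `section_repr` is simply the normalized adjacency row for that section, which shows how a section's representation mixes in information from the sections it is connected to.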

Key words: neural network, macro discourse structure, scientific paper summarization, graph convolution neural network, abstractive summarization

摘要： 针对传统的神经网络模型不能较好地反映科技论文内不同章节之间的宏观篇章结构信息，从而容易导致生成的科技论文摘要结构不完整、内容不连贯的问题，提出了一种基于宏观篇章结构的科技论文摘要模型。首先，搭建了一种基于宏观篇章结构的层级编码器，并利用图卷积神经网络对章节间的宏观篇章结构信息进行编码，从而构建章节层级语义表示；然后，提出了一个信息融合模块，旨在将宏观篇章结构信息和单词层级信息进行有效融合，从而辅助解码器生成摘要；最后，利用注意力机制优化单元对上下文向量进行更新优化操作。实验结果表明，所提出的模型比基准模型分别在ROUGE-1、ROUGE-2以及ROUGE-L上分别高出3.53个百分点、1.15个百分点和4.29个百分点，并且通过对生成的摘要内容进行分析对比，可进一步证明该模型可有效提高生成摘要的质量。

关键词: 神经网络, 宏观篇章结构, 科技论文摘要, 图卷积神经网络, 生成式摘要

CLC Number: TP391

FU Ying, WANG Hongling, WANG Zhongqing. Scientific paper summarization model using macro discourse structure[J]. Journal of Computer Applications, 2021, 41(10): 2864-2870.

付颖, 王红玲, 王中卿. 基于宏观篇章结构的科技论文摘要模型[J]. 计算机应用, 2021, 41(10): 2864-2870.
