CCDM2022+223 J-SGPGN: Paraphrase Generation Networks based on Joint Learning of Sequence and Graph

Hou Zhirong, Fan Xiaodong, Zhang Hua, Ma Xiaonan

ICBC Technology Co., Ltd.

Received: 2022-05-05 Revised: 2022-05-13 Online: 2022-06-29
Corresponding author: Hou Zhirong

Abstract

Paraphrase generation is a text data augmentation method based on natural language generation. To address the repetitive generation, semantic errors and poor diversity of paraphrase generation methods built on the Seq2Seq framework, a paraphrase generation network based on joint learning of sequence and graph (J-SGPGN) was proposed. First, the encoder of J-SGPGN fused graph encoding and sequence encoding for feature enhancement. Then, the decoder performed two kinds of decoding in parallel: sequence generation and graph generation. Finally, the model was trained with a joint learning method that combined syntactic supervision and semantic supervision, aiming to improve generation accuracy and diversity at the same time. Experimental results on the Quora dataset show that J-SGPGN improved the generation accuracy metric METEOR by 6.69% over the most accurate baseline (RNN+GCN), and improved the generation diversity metric Self-BLEU by 3.75% over the most diverse baseline (BTmPG). These results verify that J-SGPGN can generate paraphrase texts with more accurate semantics and more diverse expressions.

Key words: paraphrase generation, encoder-decoder, self-attention network, sequence generation, graph generation, joint learning

CLC number: TP391

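Hou Zhirong, Fan Xiaodong, Zhang Hua, Ma Xiaonan. CCDM2022+223 J-SGPGN: Paraphrase Generation Networks based on Joint Learning of Sequence and Graph[J]. Journal of Computer Applications.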

Hou Zhirong, Fan Xiaodong, Zhang Hua, Ma Xiaonan. J-SGPGN: Paraphrase Generation Networks based on Joint Learning of Sequence and Graph[J]. Journal of Computer Applications, 2023, 43(5): 1365-1371.