
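Multivariate controllable text generation based on diffusion sequences

LI Chenyang (李晨阳)1, ZHANG Long (张龙)1, ZHENG Qiusheng (郑秋生)1, QIAN Shaohua (钱少华)2

  1. Zhongyuan University of Technology
  2. Chongqing Changan Automobile Software Technology Co., Ltd.
  • Received: 2023-08-24 Revised: 2023-11-04 Online: 2023-12-18
  • Corresponding author: ZHANG Long
  • Funding:
    Key Scientific Research Project of Higher Education Institutions of Henan Province; Songshan Laboratory Pre-research Project; Natural Science Foundation of Zhongyuan University of Technology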

Abstract: Large-scale pre-trained language models have brought breakthrough progress in text generation. In open-ended text generation, however, the generated content lacks human-like emotional features, so readers find it hard to resonate with or connect emotionally to the text; controllable text generation is an important way to address this shortcoming. This work first extends the ChnSentiCorp dataset with topic and sentiment attributes. Then, to build a multivariate controllable text generation model that produces fluent, emotionally rich text, it proposes DiffuSeq-PT, a controllable text generation model based on diffusion sequences. The model uses a diffusion model as its base architecture and runs the diffusion process over sequences built from topic and sentiment attributes together with text data, under classifier-free guidance. The encoding and decoding capabilities of the pre-trained model ERNIE (Enhanced Representation through Knowledge Integration) are fitted to the noising and denoising process of the diffusion model, so that the model generates target text that matches the specified topic at multiple sentiment granularities. Compared with the baseline DiffuSeq, the proposed model improves BERTScore by 0.128 and 0.014 and reduces perplexity by 14.318 and 9.46 on two public real-world datasets, ChnSentiCorp and a debate dataset, respectively.

Keywords: diffusion model, sequence diffusion, pre-trained model, text generation, controllable generation, fine-grained emotion
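
The abstract names two mechanisms without giving formulas: the noising step of a diffusion model and classifier-free guidance. The sketch below illustrates the standard textbook forms of both on a toy embedding vector. It is not the authors' implementation of DiffuSeq-PT. The function names, the linear noise schedule, and the `guidance_scale` parameter are assumptions introduced here for illustration only, and the conditional and unconditional predictions are passed in as plain lists rather than produced by a real denoiser such as ERNIE.

```python
import math
import random


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed schedule)."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar_t, rng):
    """Forward step q(x_t | x_0): scale the clean embedding and add Gaussian noise."""
    signal = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def guided_prediction(cond_pred, uncond_pred, guidance_scale):
    """Classifier-free guidance: move from the unconditional prediction
    toward the conditional one by a factor of guidance_scale."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond_pred, uncond_pred)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = linear_alpha_bar(1000)
    x0 = [0.5, -1.0, 0.25]                       # toy "clean" embedding
    x_mid = add_noise(x0, alpha_bars[500], rng)  # partially noised embedding
    print("noised embedding:", x_mid)

    # Toy stand-ins for a denoiser run with and without the topic/sentiment condition.
    cond, uncond = [1.0, 2.0], [0.0, 1.0]
    print("guided prediction:", guided_prediction(cond, uncond, 2.0))
```

With `guidance_scale` set to 0 the result is the unconditional prediction, with 1 it is the conditional prediction, and larger values push the output further toward the condition, which is the usual way such a scale trades attribute control against diversity.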