Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (8): 2414-2420.DOI: 10.11772/j.issn.1001-9081.2023081137

• Artificial intelligence •

Multivariate controllable text generation based on diffusion sequences

Chenyang LI 1,2, Long ZHANG 1,2 (corresponding author), Qiusheng ZHENG 1,2, Shaohua QIAN 3

  1. Frontier Information Technology Research Institute, Zhongyuan University of Technology, Zhengzhou Henan 450007, China
    2. Henan Key Laboratory on Public Opinion Intelligent Analysis, Zhengzhou Henan 450007, China
    3. Chongqing Changan Automobile Software Technology Company Limited, Chongqing 433000, China
  • Received: 2023-08-24 Revised: 2023-11-04 Accepted: 2023-11-14 Online: 2024-08-22 Published: 2024-08-10
  • Contact: Long ZHANG
  • About author: LI Chenyang, born in 1999, M. S. candidate, CCF member. His research interests include natural language processing and text generation.
    ZHANG Long, born in 1984, Ph. D., lecturer, CCF member. His research interests include machine learning and natural language processing. E-mail: zhanglong@zut.edu.cn
    ZHENG Qiusheng, born in 1965, M. S., professor, CCF senior member. His research interests include natural language processing and network security.
    QIAN Shaohua, born in 1986, Ph. D., senior engineer. His research interests include deep learning and autonomous driving.
  • Supported by:
    This work is partially supported by the Key Research Project of Henan Higher Education Institutions (22B520054); the Songshan Laboratory Pre-research Project (YYJC032022021); the Natural Science Foundation of Zhongyuan University of Technology (K2023MS021).


Abstract:

With the emergence of large-scale pre-trained language models, text generation technology has made breakthrough progress. However, in open-ended text generation, the generated content often lacks anthropomorphic emotional features, making it difficult for the generated text to resonate with readers or establish an emotional connection. Controllable text generation is therefore of great significance in compensating for the shortcomings of current text generation technology. Firstly, the ChnSentiCorp dataset was extended with theme and emotional attributes. At the same time, in order to build a multivariate controllable text generation model capable of producing fluent and emotionally rich text, a controllable text generation model based on diffusion sequences, named DiffuSeq-PT, was proposed on top of a diffusion model architecture. Theme and emotion attributes together with the text data were used to perform the diffusion process on sequences under classifier-free guidance. The encoding and decoding capabilities of the pre-trained model ERNIE 3.0 (Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation) were used to fit the noising and denoising processes of the diffusion model, and finally target text matching the given theme and multiple sentiment granularities was generated. Compared with the benchmark model DiffuSeq, the proposed model improved BERTScore by 0.13 and 0.01 on two publicly available real datasets (ChnSentiCorp and the Debate dataset), and reduced perplexity by 14.318 and 9.46, respectively.
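To make the mechanism described above concrete, the following is a minimal sketch (not the authors' code) of one classifier-free-guided reverse diffusion step over target-token embeddings, in the spirit of DiffuSeq-style sequence diffusion. The `model` argument is a stand-in for an encoder-decoder denoiser (such as ERNIE 3.0 in the paper) that predicts the clean embeddings from the noisy target conditioned on topic/emotion prompt embeddings; all function names, the "null" prompt, and the guidance-scale value are illustrative assumptions rather than details taken from the paper.

```python
# Minimal, assumption-based sketch of a classifier-free-guided reverse step
# for sequence diffusion in embedding space. Not the DiffuSeq-PT implementation.
import torch

def guided_reverse_step(model, x_t, t, cond_emb, null_emb,
                        alpha_bar, alpha_bar_prev, guidance_scale=2.0):
    """One DDPM-style reverse step with classifier-free guidance.

    x_t:           (batch, seq_len, dim) noisy target embeddings at step t
    cond_emb:      prompt embeddings carrying topic/emotion attributes
    null_emb:      "empty" prompt used for the unconditional branch
    alpha_bar*:    cumulative noise-schedule terms for steps t and t-1
    """
    # Conditional and unconditional predictions of the clean embeddings x_0.
    x0_cond = model(x_t, t, cond_emb)
    x0_uncond = model(x_t, t, null_emb)

    # Classifier-free guidance: push the estimate toward the conditioned one.
    x0_hat = x0_uncond + guidance_scale * (x0_cond - x0_uncond)

    # Posterior mean of q(x_{t-1} | x_t, x0_hat) under a standard DDPM schedule.
    beta_t = 1 - alpha_bar / alpha_bar_prev
    coef_x0 = torch.sqrt(alpha_bar_prev) * beta_t / (1 - alpha_bar)
    coef_xt = torch.sqrt(alpha_bar / alpha_bar_prev) * (1 - alpha_bar_prev) / (1 - alpha_bar)
    mean = coef_x0 * x0_hat + coef_xt * x_t

    # Add posterior noise except at the final step.
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    var = (1 - alpha_bar_prev) / (1 - alpha_bar) * beta_t
    return mean + torch.sqrt(var) * noise
```

In a DiffuSeq-style setup, this step would typically be applied only to the target half of the concatenated source/target sequence (partial noising), and the final embeddings would be rounded back to tokens via the pre-trained model's embedding matrix.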

Key words: diffusion model, sequence diffusion, pre-trained model, prompt, text generation, controllable generation, fine-grained emotion

