Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (8): 2414-2420.DOI: 10.11772/j.issn.1001-9081.2023081137
• Artificial Intelligence •
Chenyang LI 1,2, Long ZHANG 1,2, Qiusheng ZHENG 1,2, Shaohua QIAN 3
Received: 2023-08-24
Revised: 2023-11-04
Accepted: 2023-11-14
Online: 2024-08-22
Published: 2024-08-10
Contact: Long ZHANG
About author: LI Chenyang, born in 1999 in Anyang, Henan, is an M. S. candidate and CCF member. His research interests include natural language processing and text generation.
Chenyang LI, Long ZHANG, Qiusheng ZHENG, Shaohua QIAN. Multivariate controllable text generation based on diffusion sequences[J]. Journal of Computer Applications, 2024, 44(8): 2414-2420.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023081137
Tab. 1 Distribution of theme attributes in dataset A

| Theme attribute | Reviews | Theme attribute | Reviews |
|---|---|---|---|
| Computer | 3 407 | Book | 2 773 |
| Hotel | 2 910 | | |
Tab. 2 Distribution of fine-grained sentiment attributes in dataset A

| Sentiment attribute | Reviews | Sentiment attribute | Reviews |
|---|---|---|---|
| Sadness | 2 741 | Like | 2 435 |
| Fear | 13 | Happiness | 2 100 |
| Anger | 824 | Disgust | 749 |
| Surprise | 228 | | |
Tab. 3 PPL scores of various models at different time steps

| Method | Pre-trained model | Time steps | PPL↓ (dataset A) | PPL↓ (dataset B) |
|---|---|---|---|---|
| DiffuSeq-PT | ERNIE | 32 | 210.110 | 145.930 |
| | | 64 | 63.780 | 83.650 |
| | | 256 | 46.350 | 44.590 |
| | | 512 | 20.100 | 33.735 |
| D3PM | None | 512 | 225.150 | 152.750 |
| Diffusion-LM | None | 2 000 | 196.164 | 130.145 |
| DiffuSeq | None | 2 000 | 34.418 | 43.197 |
| SeqDiffuSeq | None | 2 000 | 67.877 | 47.917 |
| GPT-2 | GPT | 1 | 38.700 | 35.780 |
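Perplexity (PPL, lower is better) in Tab. 3 is the exponential of the mean negative log-likelihood that a scoring language model assigns to the generated tokens. A minimal self-contained sketch of the computation follows; the per-token probabilities below are illustrative only and are not taken from the paper's scoring model:

```python
import math

def perplexity(token_probs):
    """PPL = exp(-(1/N) * sum(log p_i)) over the N generated tokens."""
    n = len(token_probs)
    nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(nll)

# Illustrative probabilities only: a fluent sentence receives higher
# per-token probabilities and therefore a lower perplexity.
fluent = [0.4, 0.5, 0.3, 0.6]
disfluent = [0.01, 0.05, 0.02, 0.04]
assert perplexity(fluent) < perplexity(disfluent)
```

This makes the direction of the metric concrete: a uniform per-token probability of 0.5 yields a PPL of exactly 2, and lower probabilities inflate the score rapidly.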
Tab. 4 Comparison of evaluation metrics for various models

| Method | BLEU↑ (A) | self_BLEU↓ (A) | BERTScore↑ (A) | PPL↓ (A) | BLEU↑ (B) | self_BLEU↓ (B) | BERTScore↑ (B) | PPL↓ (B) |
|---|---|---|---|---|---|---|---|---|
| Diffusion-LM | 0.256 | 0.402 | 0.547 | 196.164 | 0.268 | 0.451 | 0.587 | 130.145 |
| DiffuSeq | 0.478 | 0.499 | 0.567 | 34.418 | 0.875 | 0.917 | 0.930 | 43.197 |
| SeqDiffuSeq | 0.476 | 0.571 | 0.589 | 67.877 | 0.501 | 0.627 | 0.711 | 47.917 |
| DiffuSeq-PT (no prompt) | 0.557 | 0.454 | 0.691 | 21.493 | 0.923 | 0.568 | 0.932 | 35.917 |
| DiffuSeq-PT | 0.569 | 0.450 | 0.697 | 20.100 | 0.958 | 0.565 | 0.944 | 33.735 |
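BLEU measures n-gram overlap with a reference, while self-BLEU scores each generation against the other generations, so a high self-BLEU (lower is better) signals low output diversity. A simplified, self-contained sketch of both metrics follows; it is not the exact tooling behind Tab. 4 (standard BLEU uses 4-gram weighting and smoothing, omitted here for brevity):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    times a brevity penalty (single reference, no smoothing)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())       # clipped matches
        total = max(sum(cand.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

def self_bleu(samples, max_n=2):
    """Mean BLEU of each sample against the other samples: high
    self-BLEU means similar generations, i.e. low diversity."""
    scores = []
    for i, cand in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        scores.append(sum(bleu(cand, ref, max_n) for ref in others) / len(others))
    return sum(scores) / len(scores)
```

Identical samples drive self-BLEU to 1.0, while mutually dissimilar samples drive it toward 0, which is why Tab. 4 marks self_BLEU with a downward arrow.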
Tab. 5 Comparison of evaluation metrics for various control attribute combinations in dataset A

| Control attribute | BLEU↑ | self_BLEU↓ | BERTScore↑ | PPL↓ |
|---|---|---|---|---|
| None | 0.567 | 0.430 | 0.695 | 21.069 |
| Theme | 0.565 | 0.461 | 0.697 | 20.372 |
| 2 sentiments | 0.565 | 0.465 | 0.697 | 20.493 |
| 7 emotions | 0.571 | 0.474 | 0.696 | 21.037 |
| Theme + 2 sentiments | 0.569 | 0.466 | 0.697 | 20.100 |
| Theme + 7 emotions | 0.559 | 0.450 | 0.694 | 20.675 |
Tab. 6 Comparison of evaluation metrics for various control attribute combinations in dataset B

| Control attribute | BLEU↑ | self_BLEU↓ | BERTScore↑ | PPL↓ |
|---|---|---|---|---|
| None | 0.962 | 0.553 | 0.949 | 33.352 |
| Debate topic | 0.965 | 0.552 | 0.943 | 34.424 |
| Stance | 0.959 | 0.566 | 0.941 | 33.550 |
| Debate topic + stance | 0.958 | 0.565 | 0.944 | 33.735 |
Tab. 7 Number and predictive accuracy of each corresponding emotion

| Sentiment attribute | Reviews | Accuracy/% | Sentiment attribute | Reviews | Accuracy/% |
|---|---|---|---|---|---|
| Sadness | 2 741 | 98 | Disgust | 749 | 97 |
| Like | 2 435 | 97 | Surprise | 228 | 94 |
| Happiness | 2 100 | 98 | Fear | 13 | 32 |
| Anger | 824 | 96 | | | |
Tab. 8 Controllable generation results in dataset A

| Theme | Sentiment | Generated output | Matches |
|---|---|---|---|
| Hotel | Happiness | Good service attitude, good location, convenient parking. The room is very nice, fairly large, and clean. | √ |
| | Like | The room is pretty good, and so is the environment. | √ |
| | Fear | The price is on the high side, and the service is terrible. | × |
| Computer | Sadness | Not good-looking, the cooling still has some problems, and the memory is too small. | √ |
| | Like | The cost-performance ratio is fairly balanced, the configuration is balanced, performance and cooling are both good, the battery lasts 4 hours, and overall it is acceptable in use. | √ |
| | Happiness | The laptop's build quality is good, though the price is high. Among machines at the same price point, the cost-performance ratio is still relatively high. | √ |
| Book | Like | The quality of this book is very good; although some of its content is not easy to understand, it is a book worth reading. | √ |
| | Anger | The book's content feels a bit like it is messing with the reader,! | √ |
| | Sadness | I don't really like some of the character descriptions; it doesn't feel like very practical material. | √ |
Tab. 9 Controllable generation results in dataset B

| Debate topic | Side | Generated output | Matches |
|---|---|---|---|
| Being assimilated is more frightening than being rejected | Pro | Thank you, chair, and hello everyone. Today our side argues that being assimilated is more frightening. An individual necessarily lives in some environment, so being rejected or assimilated presupposes a difference between the individual and that environment. We find that anything frightens us because it makes us lose something precious, so today our criterion for judging which is more frightening is which of the two, being rejected or being assimilated, makes us lose something more valuable. | √ |
| | Con | Moreover, our side also finds that being rejected often causes many psychological illnesses, and can even override a person's most instinctive fear of death, so we believe this negative impact is unboundedly large. One thing your side has never argued from beginning to end is: what, in essence, is lost by being assimilated? | √ |
1 LIU X R, XU Y B, LI J X. Method for automatically generating online comments[J]. Data Analysis and Knowledge Discovery, 2023, 7(4): 101-113. (in Chinese)
2 GEHRING J, AULI M, GRANGIER D, et al. Convolutional sequence to sequence learning[C]// Proceedings of the 2017 International Conference on Machine Learning. New York: JMLR.org, 2017: 1243-1252.
3 LIN Z X, WANG L K. Network situation prediction method based on deep feature and Seq2Seq model[J]. Journal of Computer Applications, 2020, 40(8): 2241-2247. (in Chinese)
4 VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30: 6000-6010.
5 LEWIS M, LIU Y, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[EB/OL]. (2019-10-29) [2023-07-01].
6 FRÖHLING L, ZUBIAGA A. Feature-based detection of automated language models: tackling GPT-2, GPT-3 and Grover[J]. PeerJ Computer Science, 2021, 7: e443.
7 ZHANG H, SONG H, LI S, et al. A survey of controllable text generation using Transformer-based pre-trained language models[J]. ACM Computing Surveys, 2023, 56(3): Article No. 64.
8 SARZYNSKA-WAWER J, WAWER A, PAWLAK A, et al. Detecting formal thought disorder by deep contextualized word representations[J]. Psychiatry Research, 2021, 304: 114135.
9 YANG K, LIU D, LEI W, et al. Tailor: a prompt-based approach to attribute-based controlled text generation[EB/OL]. (2022-04-18) [2023-07-01].
10 LI J, TANG T, NIE J Y, et al. Learning to transfer prompts for text generation[EB/OL]. (2022-05-15) [2023-07-01].
11 GHOSH S, CHOLLET M, LAKSANA E, et al. Affect-LM: a neural language model for customizable affective text generation[EB/OL]. [2023-07-01].
12 DATHATHRI S, MADOTTO A, LAN J, et al. Plug and play language models: a simple approach to controlled text generation[EB/OL]. [2022-03-03].
13 CHAN A, ONG Y S, PUNG B, et al. CoCon: a self-supervised approach for controlled text generation[EB/OL]. [2022-06-10].
14 HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[J]. Advances in Neural Information Processing Systems, 2020, 33: 6840-6851.
15 NICHOL A Q, DHARIWAL P. Improved denoising diffusion probabilistic models[C]// Proceedings of the 2021 International Conference on Machine Learning. New York: JMLR.org, 2021: 8162-8171.
16 CRESWELL A, WHITE T, DUMOULIN V, et al. Generative adversarial networks: an overview[J]. IEEE Signal Processing Magazine, 2018, 35(1): 53-65.
17 AUSTIN J, JOHNSON D D, HO J, et al. Structured denoising diffusion models in discrete state-spaces[J]. Advances in Neural Information Processing Systems, 2021, 34: 17981-17993.
18 LI X L, THICKSTUN J, GULRAJANI I, et al. Diffusion-LM improves controllable text generation[EB/OL]. (2022-05-27) [2023-07-01].
19 GONG S, LI M, FENG J, et al. DiffuSeq: sequence to sequence text generation with diffusion models[EB/OL]. [2023-02-14].
20 KESKAR N S, McCANN B, VARSHNEY L R, et al. CTRL: a conditional transformer language model for controllable generation[EB/OL]. [2023-07-01].
21 YANG K, KLEIN D. FUDGE: controlled text generation with future discriminators[EB/OL]. [2021-08-15].
22 LI X Q, WANG S, WANG Z J, et al. Summarization of natural language generation[J]. Journal of Computer Applications, 2021, 41(5): 1227-1235. (in Chinese)
23 LIU X M, ZHANG Z H, YANG C Y, et al. Adversarial technology of text content on online social networks[J]. Chinese Journal of Computers, 2022, 45(8): 1571-1597. (in Chinese)
24 YANG L, ZHANG Z, SONG Y, et al. Diffusion models: a comprehensive survey of methods and applications[EB/OL]. [2023-10-11].
25 SUN Y, WANG S, FENG S, et al. ERNIE 3.0: large-scale knowledge enhanced pre-training for language understanding and generation[EB/OL]. (2021-07-05) [2023-07-01].
26 LI M, LONG Y, QIN L, et al. Emotion corpus construction based on selection from hashtags[C]// Proceedings of the Tenth International Conference on Language Resources and Evaluation. Paris: European Language Resources Association, 2016: 1845-1849.
27 YUAN J, CHENG L, HE R, et al. Overview of argumentative text understanding for AI debater challenge[C]// Proceedings of the 2021 International Conference on Natural Language Processing and Chinese Computing. Cham: Springer, 2021: 548-568.
28 REITER E. A structured review of the validity of BLEU[J]. Computational Linguistics, 2018, 44(3): 393-401.
29 WIETING J, BERG-KIRKPATRICK T, GIMPEL K, et al. Beyond BLEU: training neural machine translation with semantic similarity[EB/OL]. (2019-09-14) [2023-07-01].
30 ZHANG T, KISHORE V, WU F, et al. BERTScore: evaluating text generation with BERT[EB/OL]. (2020-02-24) [2023-07-01].
31 MEISTER C, COTTERELL R. Language model evaluation beyond perplexity[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2021: 5328-5339.
32 YUAN H, YUAN Z, TAN C, et al. SeqDiffuSeq: text diffusion with encoder-decoder transformers[EB/OL]. [2023-05-22].
33 LI C, ZHANG L, ZHENG Q, et al. User preference prediction for online dialogue systems based on pre-trained large model[C]// Proceedings of the 2023 International Conference on Natural Language Processing and Chinese Computing. Cham: Springer, 2023: 349-357.