Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2221-2228. DOI: 10.11772/j.issn.1001-9081.2024060865

• Artificial Intelligence •


Zero-shot dialogue state tracking domain transfer model based on semantic prefix-tuning

Yuyang SUN1, Minjie ZHANG2, Jie HU1,3,4

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei 430062, China
  2. Chucai Honors College, Hubei University, Wuhan, Hubei 430062, China
  3. Hubei Key Laboratory of Big Data Intelligent Analysis and Application (Hubei University), Wuhan, Hubei 430062, China
  4. Hubei Engineering Research Center of Intelligent Government Affairs and Artificial Intelligence Application (Hubei University), Wuhan, Hubei 430062, China
  • Received: 2024-07-02  Revised: 2024-09-02  Accepted: 2024-09-06  Online: 2025-07-10  Published: 2025-07-10
  • Contact: Jie HU
  • About the authors: SUN Yuyang, born in 2003 in Xiaogan, Hubei. Her research interests include natural language processing.
    ZHANG Minjie, born in 2003 in Wuhan, Hubei. Her research interests include natural language processing.
    HU Jie, born in 1977 in Hanchuan, Hubei, Ph.D., professor. Her research interests include complex semantic big data management and natural language processing. E-mail: Jiehu@hubu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61977021)


Abstract:

Zero-shot Dialogue State Tracking (DST) requires transferring existing models to new domains without labeled data. Existing methods often fail to capture contextual relationships in the dialogue text during domain transfer, leading to insufficient generalization when the models face unknown domains. To address this problem, a zero-shot DST domain transfer model based on semantic prefix-tuning was proposed. Firstly, slot descriptions were used to generate the initial prefix, ensuring a close semantic connection between the prefix and the dialogue text. Secondly, prefix position and domain information were fused to generate a prefix that integrates the model's internal knowledge with domain information. Thirdly, the prefix length was adjusted dynamically according to the complexity of the dialogue content, enhancing the model's sensitivity to context. Finally, global prefix insertion was employed to strengthen the model's global memory of the dialogue history. Experimental results show that, compared with the Prompter model, the proposed model improves the Joint Goal Accuracy (JGA) by 5.50, 0.90, and 7.50 percentage points in the Restaurant, Taxi, and Train domains of the MultiWOZ2.1 dataset, respectively, and by 0.65, 14.51, and 0.65 percentage points in the Messaging, Payment, and Trains domains of the SGD dataset, respectively. These results demonstrate that the proposed model effectively improves context understanding and generalization transfer performance on zero-shot DST tasks.
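
To make the four steps in the abstract concrete, the following is a minimal PyTorch sketch of the semantic prefix-tuning idea: a prefix is built from a pooled slot-description embedding, fused with positional and domain embeddings, with its length chosen from the dialogue length before being prepended to the dialogue-history token embeddings. The class name SemanticPrefixGenerator, the hidden size of 768, the complexity heuristic, and all tensor shapes are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only; module names and the length heuristic are assumptions.
import torch
import torch.nn as nn

class SemanticPrefixGenerator(nn.Module):
    """Builds a prefix from a slot-description embedding plus learned
    positional and domain embeddings; prefix length grows with dialogue length."""
    def __init__(self, hidden_dim: int, num_domains: int, max_prefix_len: int = 20):
        super().__init__()
        self.max_prefix_len = max_prefix_len
        self.domain_emb = nn.Embedding(num_domains, hidden_dim)
        self.pos_emb = nn.Embedding(max_prefix_len, hidden_dim)
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, slot_desc_emb: torch.Tensor, domain_id: torch.Tensor,
                dialogue_len: int) -> torch.Tensor:
        # slot_desc_emb: (batch, hidden) pooled embedding of the slot description.
        # Heuristic (assumption): longer dialogues get longer prefixes.
        prefix_len = min(self.max_prefix_len, max(4, dialogue_len // 16))
        positions = torch.arange(prefix_len, device=slot_desc_emb.device)
        # Fuse semantic (slot), positional, and domain information by broadcasting.
        prefix = (slot_desc_emb.unsqueeze(1)              # (batch, 1, hidden)
                  + self.pos_emb(positions).unsqueeze(0)  # (1, len, hidden)
                  + self.domain_emb(domain_id).unsqueeze(1))
        return self.proj(prefix)                          # (batch, len, hidden)

# Usage: prepend the generated prefix to the dialogue-history token embeddings
# of a frozen DST backbone; "global insertion" would repeat this at every layer.
gen = SemanticPrefixGenerator(hidden_dim=768, num_domains=5)
slot_emb = torch.randn(2, 768)              # e.g. pooled "restaurant-pricerange" description
prefix = gen(slot_emb, torch.tensor([0, 1]), dialogue_len=96)
token_emb = torch.randn(2, 96, 768)         # dialogue-history token embeddings
model_input = torch.cat([prefix, token_emb], dim=1)
print(model_input.shape)                    # torch.Size([2, 102, 768])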

Key words: Dialogue State Tracking (DST), zero-shot learning, domain transfer, prefix-tuning, Parameter-Efficient Transfer Learning (PETL)
