Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2221-2228. DOI: 10.11772/j.issn.1001-9081.2024060865

• Artificial Intelligence •


Zero-shot dialogue state tracking domain transfer model based on semantic prefix-tuning

Yuyang SUN1, Minjie ZHANG2, Jie HU1,3,4

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan, Hubei 430062, China
  2. Chucai Honors College, Hubei University, Wuhan, Hubei 430062, China
  3. Hubei Key Laboratory of Big Data Intelligent Analysis and Application (Hubei University), Wuhan, Hubei 430062, China
  4. Hubei Engineering Research Center of Intelligent Government Affairs and Artificial Intelligence Application (Hubei University), Wuhan, Hubei 430062, China
  • Received: 2024-07-02  Revised: 2024-09-02  Accepted: 2024-09-06  Online: 2025-07-10  Published: 2025-07-10
  • Contact: Jie HU
  • About the authors: SUN Yuyang, born in 2003 in Xiaogan, Hubei. Her research interests include natural language processing.
    ZHANG Minjie, born in 2003 in Wuhan, Hubei. Her research interests include natural language processing.
    HU Jie, born in 1977 in Hanchuan, Hubei, Ph.D., professor. Her research interests include complex semantic big data management and natural language processing. E-mail: Jiehu@hubu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61977021)


Abstract:

Zero-shot Dialogue State Tracking (DST) requires transferring existing models to new domains without labeled data. Existing methods often fail to capture contextual relationships in the dialogue text during domain transfer, leading to insufficient generalization when the models face unknown domains. To address this problem, a zero-shot DST domain transfer model based on semantic prefix-tuning was proposed. Firstly, slot descriptions were used to generate the initial prefix, ensuring a close semantic connection between the prefix and the dialogue text. Secondly, prefix position and domain information were fused to generate a prefix that integrates the model's internal knowledge with domain information. Thirdly, the prefix length was adjusted dynamically according to the complexity of the dialogue content, enhancing the model's sensitivity to context. Finally, global prefix insertion was employed to strengthen the model's global memory of the dialogue history. Experimental results show that, compared with the Prompter model, the proposed model improves the Joint Goal Accuracy (JGA) by 5.50, 0.90, and 7.50 percentage points in the Restaurant, Taxi, and Train domains of the MultiWOZ2.1 dataset, respectively, and by 0.65, 14.51, and 0.65 percentage points in the Messaging, Payment, and Trains domains of the SGD dataset, respectively. These results demonstrate that the proposed model effectively improves context understanding and generalization transfer performance on zero-shot DST tasks.
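
To make the four steps in the abstract concrete, the following is a minimal PyTorch sketch of the semantic prefix-tuning idea: a prefix is built from a pooled slot-description embedding, fused with positional and domain embeddings, with its length chosen from the dialogue length before being prepended to the dialogue-history token embeddings. The class name SemanticPrefixGenerator, the hidden size of 768, the complexity heuristic, and all tensor shapes are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only; module names and the length heuristic are assumptions.
import torch
import torch.nn as nn

class SemanticPrefixGenerator(nn.Module):
    """Builds a prefix from a slot-description embedding plus learned
    positional and domain embeddings; prefix length grows with dialogue length."""
    def __init__(self, hidden_dim: int, num_domains: int, max_prefix_len: int = 20):
        super().__init__()
        self.max_prefix_len = max_prefix_len
        self.domain_emb = nn.Embedding(num_domains, hidden_dim)
        self.pos_emb = nn.Embedding(max_prefix_len, hidden_dim)
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, slot_desc_emb: torch.Tensor, domain_id: torch.Tensor,
                dialogue_len: int) -> torch.Tensor:
        # slot_desc_emb: (batch, hidden) pooled embedding of the slot description.
        # Heuristic (assumption): longer dialogues get longer prefixes.
        prefix_len = min(self.max_prefix_len, max(4, dialogue_len // 16))
        positions = torch.arange(prefix_len, device=slot_desc_emb.device)
        # Fuse semantic (slot), positional, and domain information by broadcasting.
        prefix = (slot_desc_emb.unsqueeze(1)              # (batch, 1, hidden)
                  + self.pos_emb(positions).unsqueeze(0)  # (1, len, hidden)
                  + self.domain_emb(domain_id).unsqueeze(1))
        return self.proj(prefix)                          # (batch, len, hidden)

# Usage: prepend the generated prefix to the dialogue-history token embeddings
# of a frozen DST backbone; "global insertion" would repeat this at every layer.
gen = SemanticPrefixGenerator(hidden_dim=768, num_domains=5)
slot_emb = torch.randn(2, 768)              # e.g. pooled "restaurant-pricerange" description
prefix = gen(slot_emb, torch.tensor([0, 1]), dialogue_len=96)
token_emb = torch.randn(2, 96, 768)         # dialogue-history token embeddings
model_input = torch.cat([prefix, token_emb], dim=1)
print(model_input.shape)                    # torch.Size([2, 102, 768])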

Key words: Dialogue State Tracking (DST), zero-shot learning, domain transfer, prefix-tuning, Parameter-Efficient Transfer Learning (PETL)
