Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (5): 1430-1437. DOI: 10.11772/j.issn.1001-9081.2022040508

• Artificial Intelligence •

Dialogue state tracking model based on slot correlation information extraction

Lifeng SHI, Zhengwei NI

  1. School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou, Zhejiang 310018, China
  • Received: 2022-04-11  Revised: 2022-08-10  Accepted: 2022-08-16  Online: 2023-05-08  Published: 2023-05-10
  • Contact: Zhengwei NI
  • About the authors: SHI Lifeng, born in 1998 in Shaoxing, Zhejiang, M. S. candidate, CCF member. His research interests include natural language processing and machine learning.
    NI Zhengwei, born in 1989 in Jingzhou, Hubei, Ph. D., associate research fellow, CCF member. His research interests include natural language processing and machine learning. zhengwei.ni@zigsu.edu.cn
  • Supported by:
    Natural Science Foundation of Zhejiang Province (LQ22F010008)


Abstract:

Dialogue State Tracking (DST) is a key module in task-oriented dialogue systems, but existing open-vocabulary DST models do not make full use of the correlation information between slots or of the structural information of the dataset itself. To address these problems, a DST model based on slot correlation information extraction, SCEL-DST (SCE and LOW for Dialogue State Tracking), was proposed. First, a Slot Correlation Extractor (SCE) was constructed, which uses an attention mechanism to learn the correlation information between slots. Then, the Learning Optimal sample Weights (LOW) strategy was applied during training to strengthen the model's use of dataset information without substantially increasing training time. Finally, the model details were optimized to build the complete SCEL-DST model. Experimental results show that SCE and LOW are critical to the performance of SCEL-DST, which achieved higher joint goal accuracy on both experimental datasets: 1.6 percentage points higher than TripPy (Triple coPy) under the same conditions on the MultiWOZ 2.3 (Wizard-of-OZ 2.3) dataset, and 2.0 percentage points higher than AG-DST (Amendable Generation for Dialogue State Tracking) on the WOZ 2.0 (Wizard-of-OZ 2.0) dataset.

Key words: Dialogue State Tracking (DST), attention mechanism, task-oriented dialogue, Curriculum Learning (CL), pre-trained model

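The abstract's first component, SCE, learns correlations between slots with an attention mechanism. The paper's actual architecture is not reproduced here; as a rough, dependency-free illustration of the underlying idea only, the sketch below applies plain scaled dot-product attention over a set of slot vectors so that each slot's representation becomes a weighted mixture of all slots. All names, dimensions, and the single-head, projection-free form are assumptions for illustration.

```python
# Illustrative sketch of attention over slot representations (the idea
# behind a slot correlation extractor). NOT the paper's SCE architecture.
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def slot_attention(slots):
    """Mix each slot vector with every other slot via attention.

    slots: list of d-dimensional slot representations (lists of floats).
    Returns one context vector per slot, an attention-weighted sum of
    all slot vectors, so information about correlated slots (e.g. two
    domains sharing an "area" slot) can flow between them.
    """
    d = len(slots[0])
    scale = math.sqrt(d)
    out = []
    for q in slots:
        # attention scores of this slot against every slot (incl. itself)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in slots]
        weights = softmax(scores)
        # context vector: weighted sum of all slot vectors
        ctx = [sum(w * k[j] for w, k in zip(weights, slots)) for j in range(d)]
        out.append(ctx)
    return out
```

In a real model the queries, keys, and values would be learned projections of contextual slot embeddings (e.g. from a pre-trained encoder), typically with multiple heads; the convex-combination structure shown here is what lets correlated slots share evidence.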

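The abstract's second component, LOW, reweights training samples in a curriculum-learning spirit. The paper's actual LOW algorithm is not reproduced here; the sketch below is only a generic illustration of loss-based per-sample reweighting, in which harder examples (higher current loss) receive larger weight. The softmax form and the `temperature` knob are assumptions, not the paper's method.

```python
# Illustrative-only sketch of loss-based sample reweighting, in the
# curriculum-learning spirit of the LOW strategy. NOT the paper's algorithm.
import math

def sample_weights(losses, temperature=1.0):
    """Map per-sample losses to normalized training weights.

    Samples with higher loss receive larger weight, nudging the model
    toward examples it has not mastered yet. `temperature` controls how
    sharply the weights concentrate on the hardest samples.
    """
    scaled = [l / temperature for l in losses]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_loss(losses, weights):
    # weighted average of per-sample losses for one mini-batch
    return sum(w * l for w, l in zip(weights, losses))
```

Because the weights are computed from quantities already produced in the forward pass, a scheme like this adds little training time, consistent with the abstract's claim that LOW strengthens use of the dataset "without substantially increasing training time".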