
Building a hierarchical dialog generation model using deep reinforcement learning

ZHAO Yu-Qing1, XIANG Yang2

  1. Tongji University
  2. College of Electronics and Information Engineering, Tongji University
  • Received: 2017-04-28  Revised: 2017-06-02  Online: 2017-06-02
  • Corresponding author: XIANG Yang



Abstract: Aimed at the dialog generation problem, this paper proposes a dialog generation model to address the tendency of standard seq2seq architectures to produce highly generic responses when trained with the maximum likelihood estimation (MLE) loss function. The method combines hierarchical encoding with deep reinforcement learning: it models multi-turn dialog with a hierarchical structure, adding an intermediate layer on top of the standard seq2seq architecture to strengthen the memory of the dialog history; it then uses a language model to build the reward function and replaces the traditional MLE loss with the policy-gradient method from deep reinforcement learning for training. Experimental results show that the proposed model (Enhanced HRED) generates responses with richer semantic information and outperforms the widely used RNN-based dialog generation models by about 17%-21% on standard manual evaluation metrics.
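The abstract describes two components: an HRED-style hierarchical encoder whose extra context layer summarizes the dialog history, and policy-gradient training whose reward is a language-model score of the sampled response. Below is a minimal sketch of both ideas, not the authors' implementation; it assumes PyTorch, and all module names, shapes, and hyperparameters are illustrative.

```python
# A minimal sketch (not the paper's code) of the two ideas in the abstract:
# (1) a hierarchical (HRED-style) encoder whose extra context RNN keeps a
#     memory of previous utterances, and
# (2) a REINFORCE-style policy-gradient loss whose sequence-level reward
#     comes from an external language model, replacing the MLE loss.
import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    """Utterance-level GRU feeds a dialog-level (context) GRU."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.utterance_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        # The added intermediate layer: an RNN over per-utterance summaries
        # that carries the memory of the dialog history across turns.
        self.context_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, dialog):
        # dialog: (batch, n_turns, seq_len) token ids
        b, t, l = dialog.shape
        emb = self.embed(dialog.view(b * t, l))       # (b*t, l, emb_dim)
        _, h = self.utterance_rnn(emb)                # h: (1, b*t, hid_dim)
        turn_vecs = h.squeeze(0).view(b, t, -1)       # one vector per turn
        _, ctx = self.context_rnn(turn_vecs)          # dialog-level summary
        return ctx.squeeze(0)                         # (batch, hid_dim)

def policy_gradient_loss(log_probs, rewards):
    """REINFORCE: maximize expected reward instead of token likelihood.

    log_probs: (batch, seq_len) log-probabilities of the sampled response.
    rewards:   (batch,) sequence-level scores, e.g. a language-model score
               of each sampled response (the reward described above).
    """
    baseline = rewards.mean()                 # simple variance reduction
    advantage = rewards - baseline
    return -(log_probs.sum(dim=1) * advantage).mean()
```

In such a setup, a decoder would condition on the returned context vector; during reinforcement-learning training, responses are sampled from the decoder, scored by the language model, and the scores are fed to policy_gradient_loss in place of the MLE objective.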

Key words: dialog generation, deep reinforcement learning, hierarchical encoding, recurrent neural network, sequence to sequence (seq2seq)
