Journal of Computer Applications, 2018, Vol. 38, Issue (7): 1839-1845. DOI: 10.11772/j.issn.1001-9081.2017122996

Multi-intention recognition model with combination of syntactic feature and convolution neural network

YANG Chunni1, FENG Chaosheng1,2   

  1. School of Computer Science, Sichuan Normal University, Chengdu Sichuan 610101, China;
  2. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 610054, China
  • Received: 2017-12-21 Revised: 2018-02-08 Online: 2018-07-10 Published: 2018-07-12
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61373163) and the National Key Technology Support Program (2014BAH11F02, 2014BAH11F01).

结合句法特征和卷积神经网络的多意图识别模型

杨春妮1, 冯朝胜1,2   

  1. 四川师范大学 计算机科学学院, 成都 610101;
  2. 电子科技大学 信息与软件工程学院, 成都 610054
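  • Corresponding author: FENG Chaosheng
  • About the authors: YANG Chunni (born 1994), female, from Yibin, Sichuan, is a master's student whose research interests include natural language processing and text mining. FENG Chaosheng (born 1971), male, from Chengdu, Sichuan, is a professor with a Ph.D. and a senior member of CCF; his research interests include cloud computing and big data, privacy protection, and data security.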

Abstract: Multi-Intention (MI) recognition of short texts is a difficult problem in Spoken Language Understanding (SLU). Short texts have sparse features and pack a large amount of information into few words, so effective features are hard to extract for classification. To address this problem, a multi-intention recognition model combining syntactic features with a Convolution Neural Network (CNN) is proposed. Firstly, dependency parsing is applied to a sentence to determine whether it contains multiple intentions. Secondly, a distance matrix is computed from Term Frequency-Inverse Document Frequency (TF-IDF) weights and pre-trained word embeddings to determine the number of intentions. Then, the distance matrix is used as the input of the CNN model to classify the intentions. Finally, the emotional polarity of each intention is judged to derive the user's true intentions. Experiments were carried out on real data from an existing intelligent customer service system. The results show that the single-intention classification precision of the model combining syntactic features and CNN reaches 93.5% over 10 intentions, which is 1.4 percentage points higher than that of the CNN model without syntactic features. In multi-intention recognition, its precision is about 30 percentage points higher than that of the other models.

Key words: Spoken Language Understanding (SLU), Multi-Intention (MI) recognition, syntactic feature, Convolution Neural Network (CNN), natural language
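
The abstract does not say how the distance matrix is constructed, so the following is only a minimal sketch of one plausible reading: each word's embedding is scaled by its TF-IDF weight, and pairwise Euclidean distances between the scaled vectors form the matrix that would be fed to the CNN. The function names, the toy corpus, the 2-dimensional embeddings, and the choice of Euclidean distance are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized (non-empty) document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(tokens, weights, embeddings):
    """Pairwise distances between TF-IDF-scaled word vectors (assumed scheme)."""
    vecs = [[weights.get(t, 0.0) * x for x in embeddings[t]] for t in tokens]
    return [[euclidean(u, v) for v in vecs] for u in vecs]


# Hypothetical toy corpus and embeddings, for illustration only.
docs = [
    ["check", "balance", "cancel", "card"],
    ["check", "balance"],
    ["cancel", "card", "now"],
]
embeddings = {
    "check": [1.0, 0.0],
    "balance": [0.9, 0.1],
    "cancel": [0.0, 1.0],
    "card": [0.1, 0.9],
    "now": [0.5, 0.5],
}

matrix = distance_matrix(docs[0], tf_idf(docs)[0], embeddings)
```

In this toy example the words of the first utterance fall into two groups ("check"/"balance" and "cancel"/"card"): distances within a group are small and distances across groups are large, which is the kind of structure that could indicate two intentions in one sentence.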

