Journal of Computer Applications ›› 2018, Vol. 38 ›› Issue (8): 2170-2174. DOI: 10.11772/j.issn.1001-9081.2018010190

• Artificial Intelligence •

Identification method of user's medical intention in chatting robot (聊天机器人中用户就医意图识别方法)

YU Hui (余慧)1, FENG Xupeng (冯旭鹏)2, LIU Lijun (刘利军)1, HUANG Qingsong (黄青松)1,3

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming Yunnan 650500, China;
    2. Educational Technology and Network Center, Kunming University of Science and Technology, Kunming Yunnan 650500, China;
    3. Yunnan Provincial Key Laboratory of Computer Technology Applications (Kunming University of Science and Technology), Kunming Yunnan 650500, China
  • Received: 2018-01-23 Revised: 2018-03-28 Online: 2018-08-10 Published: 2018-08-11
  • Corresponding author: HUANG Qingsong
  • About the authors: YU Hui (born 1993), female, from Chengdu, Sichuan, M.S. candidate; main research interests: natural language processing, medical information services. FENG Xupeng (born 1986), male, from Zhengzhou, Henan, assistant experimentalist, M.S.; main research interest: information retrieval. LIU Lijun (born 1978), male, from Xinxiang, Henan, lecturer, M.S.; main research interest: medical information services. HUANG Qingsong (born 1962), male, from Changsha, Hunan, professor, M.S.; main research interests: intelligent information systems, information retrieval.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (81360230, 81560296).

Abstract: User intention recognition in traditional chatting robots is usually based on template matching or hand-crafted feature sets. These methods are laborious and time-consuming, and they extend poorly. To address these problems, and taking into account the characteristics of medical chatting text, an intention recognition model based on the Biterm Topic Model (BTM) and the Bidirectional Gated Recurrent Unit (BiGRU) was proposed. The hybrid model treats the identification of a user's medical intention as a classification problem and uses topic features. Firstly, BTM mines and quantifies the topic of each sentence in the user's chatting text. Then, these per-sentence topic representations are fed into the BiGRU, which learns over the complete context to obtain the final representation of the user's consecutive statements. Finally, classification on this representation identifies the user's medical intention. In comparison experiments on a crawled corpus, the BTM-BiGRU model clearly outperforms traditional methods such as Support Vector Machine (SVM), and its F value is nearly 1.5 percentage points higher than that of the combined Convolutional Neural Network and Long Short-Term Memory network (CNN-LSTM), one of the better-performing current models. The experimental results show that, on this task, the hybrid model focuses on the characteristics of the research object and can effectively improve the accuracy of intention recognition.

Key words: identification of medical intention, medical chatting text, biterm topic model, bidirectional gated recurrent unit, template matching
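
The pipeline described in the abstract (per-sentence topic vectors, a bidirectional GRU over the sentence sequence, then classification) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the topic vectors are hypothetical hand-written values standing in for BTM output, the weights are random and untrained, bias terms are omitted, and all names (`GRUCell`, `classify`, `NUM_TOPICS`, `HIDDEN`, `NUM_INTENTS`) are assumptions introduced here. The sketch also assumes that the final representation is the concatenation of the last forward and last backward hidden states, which the abstract does not specify.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """One GRU direction with random, untrained weights (illustration only)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # W* act on the input vector, U* on the previous hidden state.
        self.Wz = rand_matrix(hidden_dim, input_dim, rng)
        self.Uz = rand_matrix(hidden_dim, hidden_dim, rng)
        self.Wr = rand_matrix(hidden_dim, input_dim, rng)
        self.Ur = rand_matrix(hidden_dim, hidden_dim, rng)
        self.Wh = rand_matrix(hidden_dim, input_dim, rng)
        self.Uh = rand_matrix(hidden_dim, hidden_dim, rng)

    def step(self, x, h):
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        # Candidate state uses the reset-gated previous hidden state.
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, gated_h))]
        # Interpolate between the previous state and the candidate.
        return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, sequence):
        """Return the hidden state after reading the whole sequence."""
        h = [0.0] * self.hidden_dim
        for x in sequence:
            h = self.step(x, h)
        return h


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(topic_vectors, fwd, bwd, w_out):
    """Encode the sentence sequence in both directions, then classify."""
    representation = fwd.run(topic_vectors) + bwd.run(topic_vectors[::-1])
    return softmax(matvec(w_out, representation))


rng = random.Random(0)
NUM_TOPICS, HIDDEN, NUM_INTENTS = 4, 3, 2
fwd = GRUCell(NUM_TOPICS, HIDDEN, rng)
bwd = GRUCell(NUM_TOPICS, HIDDEN, rng)
w_out = rand_matrix(NUM_INTENTS, 2 * HIDDEN, rng)

# Hypothetical topic distributions for three consecutive user sentences;
# in the paper's setting these would come from a trained BTM.
chat = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.6, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
probs = classify(chat, fwd, bwd, w_out)
print(probs)
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how a sequence of sentence-level topic vectors is reduced to one context-aware representation and mapped to intent classes.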

CLC Number: