Journal of Computer Applications (计算机应用) ›› 2012, Vol. 32 ›› Issue (06): 1685-1687. DOI: 10.3724/SP.J.1087.2012.01685

• Artificial Intelligence •

Question classification in restricted domain using syntactic parsing-based quadratic-Bayesian model

JI Yu (嵇宇), WANG Rong-bo (王荣波), CHEN Zhi-qun (谌志群)

  1. Institute of Computer Application Technology, School of Computer Science, Hangzhou Dianzi University, Hangzhou Zhejiang 310018, China
  • Received: 2011-12-16 Revised: 2012-02-16 Online: 2012-06-04 Published: 2012-06-01
  • Contact: JI Yu
  • About the authors: JI Yu (born 1988), male, from Baoji, Shaanxi, master's student; main research interest: Chinese question understanding. WANG Rong-bo (born 1978), male, from Yiwu, Zhejiang, associate professor, Ph.D.; main research interests: natural language processing and public opinion analysis. CHEN Zhi-qun (born 1973), male, from Nanchang, Jiangxi, associate professor, Ph.D.; main research interests: Chinese information processing and question answering systems.
  • Supported by:
    Natural Science Foundation of Zhejiang Province

Abstract: To address the particular characteristics of restricted domains, this paper proposes a new question classification method based on syntactic parsing and a quadratic-Bayesian model. First, shallow syntactic parsing is performed on Chinese question sentences. Second, the main trunk (subject-predicate structure) of each parsed question, together with the interrogative words and their adjunct components, is extracted as the classification features, which greatly reduces noise. Third, an improved quadratic-Bayesian classification model suited to restricted-domain question classification is constructed and evaluated in extensive experiments. The experimental results show that the method is effective in the restricted domain: the average classification precision reaches 89.66% for coarse classes and 84.13% for fine classes.

Key words: question classification, quadratic-Bayesian model, question answering system, syntactic parsing
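
The abstract does not spell out the quadratic-Bayesian model, so the following Python sketch is only one plausible reading of the pipeline, not the authors' implementation. It assumes that (a) a shallow parser has already labelled each chunk with a role, and (b) "quadratic" means applying a naive Bayes classifier twice, first for the coarse class and then for the fine class within it. The role labels (`SUBJ`, `PRED`, `OBJ`, `WH`, `WH_ADJUNCT`), the class names, the toy training questions, and all function and class names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical chunk roles assumed to come from a shallow parser.
KEPT_ROLES = {"SUBJ", "PRED", "OBJ", "WH", "WH_ADJUNCT"}


def select_features(chunks):
    """Keep only the main trunk and the interrogative parts of a question."""
    return [token for token, role in chunks if role in KEPT_ROLES]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()            # label -> number of questions
        self.word_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocab = set()

    def fit(self, samples):
        for features, label in samples:
            self.class_counts[label] += 1
            self.word_counts[label].update(features)
            self.vocab.update(features)
        return self

    def predict(self, features):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for feature in features:
                score += math.log((self.word_counts[label][feature] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class TwoPassClassifier:
    """Assumed two-pass scheme: coarse class first, then fine class within it."""

    def fit(self, samples):
        self.coarse = NaiveBayes().fit([(f, c) for f, c, _ in samples])
        by_coarse = defaultdict(list)
        for features, coarse, fine in samples:
            by_coarse[coarse].append((features, fine))
        self.fine = {c: NaiveBayes().fit(rows) for c, rows in by_coarse.items()}
        return self

    def predict(self, features):
        coarse = self.coarse.predict(features)
        return coarse, self.fine[coarse].predict(features)


# Toy, made-up training data: (features, coarse class, fine class).
TRAINING = [
    (["what", "price", "phone"], "NUM", "NUM.price"),
    (["how_much", "cost", "battery"], "NUM", "NUM.price"),
    (["how_many", "colors", "phone"], "NUM", "NUM.count"),
    (["how_many", "cameras", "phone"], "NUM", "NUM.count"),
    (["where", "buy", "phone"], "LOC", "LOC.store"),
    (["where", "store", "located"], "LOC", "LOC.store"),
    (["where", "phone", "made"], "LOC", "LOC.origin"),
    (["which_country", "made", "battery"], "LOC", "LOC.origin"),
]

model = TwoPassClassifier().fit(TRAINING)

chunks = [("excuse_me", "OTHER"), ("how_much", "WH"), ("cost", "PRED"),
          ("phone", "SUBJ"), ("please", "OTHER")]
features = select_features(chunks)
print(features)                 # ['how_much', 'cost', 'phone']
print(model.predict(features))  # ('NUM', 'NUM.price')
```

Dropping chunks outside the main trunk and the interrogative parts removes polite filler such as `excuse_me` before classification, which is the noise reduction the abstract describes. Restricting the second pass to the fine classes of the predicted coarse class keeps each fine-grained decision small, at the cost of propagating any coarse-level error.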