Journal of Computer Applications

• Typical applications

Official document classification using stochastic keyword generation

Ying Liu, Minghan Hu

  1. Liaoning Finance Vocational College
  • Received: 2007-11-13 Revised: 1900-01-01 Online: 2008-05-01 Published: 2008-05-01
  • Corresponding author: Ying Liu

Abstract: A classification system for government official documents that carry a topic-phrase structure was designed and implemented. During classification preprocessing, the system makes full use of the category information carried by the topic phrases: stochastic keyword generation and Bootstrapping learning are applied to transform the feature space of the document text and reduce its dimensionality. The resulting preprocessing procedure differs from that of traditional text classification and improves the performance of the classification system. The system based on stochastic keyword generation and Bootstrapping learning classifies better than an ordinary classifier.

Key words: official document classification, Bootstrapping, stochastic keyword generation, Naïve Bayes
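The abstract gives no implementation details, so the following is only a minimal sketch of one way the feature-space transformation could look, not the authors' system. It assumes that each training document comes with a set of topic phrases, that one two-class Naïve Bayes model is fitted per topic phrase, and that a document is then represented by the vector of probabilities that each topic phrase would be generated from its text. The function names (`train_keyword_model`, `keyword_probability`, `to_keyword_space`), the smoothing parameter `alpha`, and the toy data are all hypothetical, and the Bootstrapping step is not shown.

```python
import math
from collections import Counter


def train_keyword_model(docs, has_keyword, alpha=1.0):
    """Fit a two-class multinomial Naive Bayes model for one topic phrase.

    docs: list of token lists; has_keyword: list of bools, one per document.
    """
    vocab = {tok for doc in docs for tok in doc}
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for doc, label in zip(docs, has_keyword):
        counts[label].update(doc)
        n_docs[label] += 1
    return {"vocab": vocab, "counts": counts, "n_docs": n_docs, "alpha": alpha}


def keyword_probability(model, doc):
    """Return P(topic phrase is attached | doc) under the fitted model."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    total_docs = sum(model["n_docs"].values())
    log_score = {}
    for label in (True, False):
        # Laplace-smoothed class prior.
        log_p = math.log((model["n_docs"][label] + alpha) / (total_docs + 2 * alpha))
        total_tokens = sum(model["counts"][label].values())
        for tok in doc:
            if tok in model["vocab"]:  # tokens unseen in training are skipped
                log_p += math.log(
                    (model["counts"][label][tok] + alpha)
                    / (total_tokens + alpha * vocab_size)
                )
        log_score[label] = log_p
    # Normalise the two log scores into a probability.
    top = max(log_score.values())
    pos = math.exp(log_score[True] - top)
    neg = math.exp(log_score[False] - top)
    return pos / (pos + neg)


def to_keyword_space(models, doc):
    """Map a token list to one probability per topic phrase (sorted by name)."""
    return [keyword_probability(models[k], doc) for k in sorted(models)]


# Toy training set (hypothetical): token lists with their topic phrases.
train = [
    (["budget", "tax", "revenue", "report"], {"finance"}),
    (["tax", "audit", "budget"], {"finance"}),
    (["school", "teacher", "exam"], {"education"}),
    (["exam", "school", "enrollment", "report"], {"education"}),
]
docs = [tokens for tokens, _ in train]
keywords = sorted({k for _, ks in train for k in ks})
models = {
    k: train_keyword_model(docs, [k in ks for _, ks in train]) for k in keywords
}

# A 9-word vocabulary is reduced to a 2-dimensional keyword space.
vec = to_keyword_space(models, ["tax", "budget", "notice"])
print(dict(zip(keywords, vec)))
```

In this sketch the dimensionality of the representation drops from the vocabulary size to the number of topic phrases, which is the kind of reduction the abstract describes; any downstream classifier could then be trained on the transformed vectors.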