Journal of Computer Applications ›› 2010, Vol. 30 ›› Issue (07): 1941-1943.

• Database Technology •

Chinese word segmentation for GIS based on proper-noun priority

Luo Hao (罗浩)1, Wei Zukuan (魏祖宽)2, Kim Jae-hong (金在弘)3

  1. University of Electronic Science and Technology of China, Chengdu
  2. University of Electronic Science and Technology of China, Chengdu
  3. Youngdong University, Korea
  • Received: 2010-01-20 Revised: 2010-03-02 Online: 2010-07-01 Published: 2010-07-01
  • Corresponding author: Luo Hao

Abstract: This paper proposes a Chinese word segmentation method for the Geographic Information System (GIS) domain that gives priority to proper nouns. It uses a dictionary mechanism that combines a specialized dictionary, a general dictionary, and a synonym dictionary. Proper nouns are segmented first, and the resulting coarse segmentation is then disambiguated with a Trigram model to obtain the final result. Experiments show that the algorithm achieves good speed and accuracy when segmenting specialized literature.

Keywords: Chinese word segmentation, specialized dictionary, Trigram model, synonym dictionary
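
The pipeline summarized in the abstract (proper nouns first, then coarse segmentation, then Trigram disambiguation) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy dictionaries, the counts, the use of forward and backward maximum matching to produce coarse candidates, the add-one smoothing, and the use of the synonym dictionary to normalize terms are all assumptions made for illustration, and every function name is hypothetical.

```python
from math import log

# Toy resources, purely illustrative (none of these come from the paper).
DOMAIN_DICT = {"地理信息系统", "电子科技大学"}      # specialized dictionary
SYNONYMS = {"GIS": "地理信息系统"}                  # synonym -> canonical term
GENERAL_DICT = {"研究", "研究生", "生命", "起源", "的", "在"}
TRIGRAM_COUNTS = {("<s>", "研究", "生命"): 3, ("研究", "生命", "起源"): 3}
BIGRAM_COUNTS = {("<s>", "研究"): 3, ("研究", "生命"): 3}
VOCAB_SIZE = len(GENERAL_DICT) + len(DOMAIN_DICT)

def longest_match(text, start, vocab, max_len):
    """Return the longest vocab entry starting at `start`, or None."""
    for end in range(min(len(text), start + max_len), start, -1):
        if text[start:end] in vocab:
            return text[start:end]
    return None

def split_proper_nouns(text):
    """Pass 1: cut out domain terms first; returns (chunk, is_proper) pairs."""
    vocab = DOMAIN_DICT | set(SYNONYMS)
    max_len = max(map(len, vocab))
    chunks, buf, i = [], "", 0
    while i < len(text):
        word = longest_match(text, i, vocab, max_len)
        if word:
            if buf:
                chunks.append((buf, False))
                buf = ""
            # Assumed use of the synonym dictionary: map to a canonical term.
            chunks.append((SYNONYMS.get(word, word), True))
            i += len(word)
        else:
            buf += text[i]
            i += 1
    if buf:
        chunks.append((buf, False))
    return chunks

def forward_mm(text):
    """Forward maximum matching; unknown characters become single tokens."""
    max_len = max(map(len, GENERAL_DICT))
    out, i = [], 0
    while i < len(text):
        word = longest_match(text, i, GENERAL_DICT, max_len) or text[i]
        out.append(word)
        i += len(word)
    return out

def backward_mm(text):
    """Backward maximum matching, scanning from the end of the text."""
    max_len = max(map(len, GENERAL_DICT))
    out, j = [], len(text)
    while j > 0:
        for i in range(max(0, j - max_len), j):
            if text[i:j] in GENERAL_DICT:
                break
        else:
            i = j - 1  # no dictionary word ends here: emit one character
        out.append(text[i:j])
        j = i
    return out[::-1]

def trigram_logprob(words):
    """Add-one smoothed trigram log-probability of a candidate segmentation."""
    padded = ["<s>", "<s>"] + words
    score = 0.0
    for k in range(2, len(padded)):
        tri = tuple(padded[k - 2:k + 1])
        c3 = TRIGRAM_COUNTS.get(tri, 0)
        c2 = BIGRAM_COUNTS.get(tri[:2], 0)
        score += log((c3 + 1) / (c2 + VOCAB_SIZE))
    return score

def segment(text):
    """Proper nouns first, then pick the higher-scoring coarse candidate."""
    result = []
    for chunk, is_proper in split_proper_nouns(text):
        if is_proper:
            result.append(chunk)
        else:
            candidates = [forward_mm(chunk), backward_mm(chunk)]
            result.extend(max(candidates, key=trigram_logprob))
    return result

print(segment("GIS研究生命起源"))
```

In this toy setup, forward matching alone would split 研究生命起源 as 研究生 / 命 / 起源, while the trigram score prefers the backward candidate 研究 / 生命 / 起源. For simplicity the sketch scores each non-proper chunk in isolation; a fuller version would likely score across chunk boundaries and consider more than two candidates.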