Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (4): 1006-1008. DOI: 10.11772/j.issn.1001-9081.2015.04.1006

• Artificial Intelligence •

Word sense disambiguation method based on context

YANG Zhizhuo

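  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006
  • Received: 2014-11-15 Revised: 2015-01-14 Online: 2015-04-10 Published: 2015-04-08
  • Corresponding author: YANG Zhizhuo
  • About the author: YANG Zhizhuo (born 1983), male, from Linfen, Shanxi; lecturer, Ph.D.; main research interests: natural language processing and data mining.
  • Supported by:

    National Natural Science Foundation of China (61403238); Natural Science Foundation of Shanxi Province (2014021022-1).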

Word sense disambiguation method based on knowledge context

YANG Zhizhuo   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China
  • Received: 2014-11-15 Revised: 2015-01-14 Online: 2015-04-10 Published: 2015-04-08

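Abstract:

To address the data sparseness problem faced by traditional word sense disambiguation methods, a word sense disambiguation method based on context is proposed. The method assumes that sentences in the same article share some common topics. First, the sentences in the same article that contain the same ambiguous word are extracted; these sentences serve as the context of an ambiguous sentence and provide disambiguation knowledge for it. Second, word sense disambiguation is performed with an unsupervised word sense disambiguation method. Experimental results on a real corpus show that, with 2 context sentences and a window size of 1, the disambiguation accuracy of the method is 3.26% higher than that of the baseline method (OrigDisam).

Keywords: data sparseness, word sense disambiguation, context, network graph model, parameter estimation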

Abstract:

To overcome the data sparseness problem of traditional Word Sense Disambiguation (WSD) methods, a new WSD method based on knowledge context was proposed. The method assumes that sentences within one article share some common topics. First, a similarity algorithm was used to obtain the sentences in the article that contain the same ambiguous word; these sentences serve as the knowledge context of an ambiguous sentence and provide disambiguation knowledge for it. Then, a graph-based ranking algorithm was used for WSD. Experimental results on real data show that, with two knowledge context sentences and a window size of 1, the disambiguation accuracy of the method is 3.26% higher than that of the baseline method (OrigDisam).

Keywords: data sparseness, Word Sense Disambiguation (WSD), knowledge context, graph-based model, parameter estimation
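
The first step described in the abstract, collecting knowledge context sentences from the same article, might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the similarity algorithm, so Jaccard overlap is used here as a stand-in, and the function names (`select_context_sentences`, `window_words`), the regex tokenizer, and the sample article are all hypothetical. The graph-based ranking step is not sketched because the abstract gives no detail about it.

```python
import re


def tokenize(sentence):
    """Lowercase word tokens; a stand-in for real word segmentation."""
    return re.findall(r"[a-z']+", sentence.lower())


def jaccard(tokens_a, tokens_b):
    """Jaccard overlap of two token lists (assumed similarity measure)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def select_context_sentences(sentences, target_index, ambiguous_word, k=2):
    """Return up to k other sentences of the same article that contain
    ambiguous_word, ordered from most to least similar to the target."""
    target_tokens = tokenize(sentences[target_index])
    candidates = []
    for i, sentence in enumerate(sentences):
        if i == target_index:
            continue
        tokens = tokenize(sentence)
        if ambiguous_word in tokens:
            candidates.append((jaccard(target_tokens, tokens), i, sentence))
    # Highest similarity first; ties broken by position in the article.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [sentence for _, _, sentence in candidates[:k]]


def window_words(sentence, ambiguous_word, window=1):
    """Words within `window` positions of each occurrence of ambiguous_word."""
    tokens = tokenize(sentence)
    words = []
    for i, token in enumerate(tokens):
        if token == ambiguous_word:
            words.extend(tokens[max(0, i - window):i])
            words.extend(tokens[i + 1:i + 1 + window])
    return words


if __name__ == "__main__":
    # Hypothetical article; sentence 0 is the ambiguous sentence.
    article = [
        "He sat on the river bank and watched the water.",
        "The bank was covered with reeds.",
        "Fish swim near the muddy bank of the river.",
        "The weather was warm.",
    ]
    for context in select_context_sentences(article, 0, "bank", k=2):
        print(context, window_words(context, "bank", window=1))
```

With `k=2` and `window=1`, matching the setting reported in the abstract, the words gathered from the two selected sentences would then be handed to whatever disambiguation algorithm is used downstream.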

CLC number: