Word sense disambiguation method based on knowledge context

doi:10.11772/j.issn.1001-9081.2015.04.1006

Journal of Computer Applications ›› 2015, Vol. 35 ›› Issue (4): 1006-1008. DOI: 10.11772/j.issn.1001-9081.2015.04.1006

YANG Zhizhuo

School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China

Received: 2014-11-15 Revised: 2015-01-14 Online: 2015-04-08 Published: 2015-04-10
Corresponding author: YANG Zhizhuo
About the author: YANG Zhizhuo (born 1983), male, from Linfen, Shanxi; lecturer, Ph.D.; main research interests: natural language processing and data mining.
Funding:
National Natural Science Foundation of China (61403238); Natural Science Foundation of Shanxi Province (2014021022-1).

Abstract

To address the data sparseness problem faced by traditional Word Sense Disambiguation (WSD) methods, a WSD method based on knowledge context is proposed. The method assumes that sentences within the same article share some common topics. First, a similarity algorithm is used to extract sentences in the article that contain the same ambiguous word; these sentences serve as the knowledge context of an ambiguous sentence and provide disambiguation knowledge for it. Then, an unsupervised, graph-based ranking algorithm is used for WSD. Experimental results on real data show that, with two knowledge context sentences and a window size of 1, the disambiguation accuracy of the method is 3.26% higher than that of the baseline method (OrigDisam).

Key words: data sparseness, Word Sense Disambiguation (WSD), knowledge context, graph-based model, parameter estimation
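
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify the similarity measure or the graph construction, so Jaccard word overlap, plain PageRank by power iteration, the function names, the toy article, and the sense labels such as `bank#river` are all hypothetical stand-ins.

```python
from collections import defaultdict


def knowledge_context(sentences, target_index, ambiguous_word, k=2):
    """Return up to k other sentences of the article containing ambiguous_word.

    Candidates are ranked by Jaccard word overlap with the target sentence
    (an assumed similarity measure, not necessarily the paper's).
    """
    target = set(sentences[target_index])
    scored = []
    for i, sent in enumerate(sentences):
        if i == target_index or ambiguous_word not in sent:
            continue
        words = set(sent)
        overlap = len(target & words) / len(target | words)
        scored.append((overlap, i))
    # Highest overlap first; ties keep document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [sentences[i] for _, i in scored[:k]]


def window_words(sentence, ambiguous_word, size=1):
    """Words within `size` positions of the first occurrence of ambiguous_word."""
    pos = sentence.index(ambiguous_word)
    return sentence[max(0, pos - size):pos] + sentence[pos + 1:pos + 1 + size]


def pagerank(edges, damping=0.85, iterations=50):
    """Plain PageRank by power iteration on an undirected weighted graph."""
    neighbours = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbours[u][v] = w
        neighbours[v][u] = w
    nodes = list(neighbours)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m passes on its rank in proportion to edge weight.
            incoming = sum(
                rank[m] * w / sum(neighbours[m].values())
                for m, w in neighbours[n].items()
            )
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


# Toy article (hypothetical data): sentence 0 is the ambiguous sentence.
article = [
    ["he", "sat", "on", "the", "river", "bank"],
    ["the", "bank", "approved", "the", "loan"],
    ["the", "river", "bank", "was", "muddy"],
    ["fish", "swim", "in", "the", "river"],
]
context = knowledge_context(article, 0, "bank", k=2)

# Hypothetical sense graph: candidate senses linked to window words (size 1)
# drawn from the ambiguous sentence and its closest context sentence.
edges = {
    ("bank#river", "river"): 1.0,
    ("bank#river", "muddy"): 1.0,
    ("bank#finance", "loan"): 1.0,
}
rank = pagerank(edges)
best_sense = max(["bank#river", "bank#finance"], key=rank.get)
```

In this toy setup the sentence about the muddy river bank ranks as the closest context sentence, and the better-connected sense node receives the higher score. How the graph edges and weights are actually built in the method is not stated in the abstract, so the edge list here is purely illustrative.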

CLC number:

TP391

YANG Zhizhuo. Word sense disambiguation method based on knowledge context[J]. Journal of Computer Applications, 2015, 35(4): 1006-1008.
