Journal of Computer Applications, 2014, Vol. 34, Issue (4): 1109-1113. DOI: 10.11772/j.issn.1001-9081.2014.04.1109

• Artificial Intelligence •

融合词义信息的中文短语句法分析

耿立飞1,2,李红莲1,吕学强2,吴云芳3   

  1. 北京信息科技大学 信息与通信工程学院,北京 100101;
    2. 网络文化与数字传播北京市重点实验室(北京信息科技大学),北京 100101;
    3. 北京大学 计算语言学研究所,北京 100871
  • 收稿日期:2013-09-02 修回日期:2013-11-22 出版日期:2014-04-01 发布日期:2014-04-29
  • 通讯作者: 耿立飞
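  • About the authors: GENG Lifei (born 1987), male, from Xingtai, Hebei, master's student, CCF member; main research interest: Chinese information processing;
    LI Honglian (born 1971), male, from Baoding, Hebei, associate professor, Ph.D.; main research interest: statistical learning;
    LYU Xueqiang (born 1970), male, from Yutai, Shandong, professor, Ph.D., CCF member; main research interests: Chinese information processing and multimedia information processing.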
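  • Supported by:

    National Natural Science Foundation of China; Key Project of the Science and Technology Development Program of the Beijing Municipal Education Commission and Class-B Key Project of the Beijing Natural Science Foundation; Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (Beijing Information Science and Technology University)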

Chinese phrase parsing with semantic information

GENG Lifei1,2,LI Honglian2,LYU Xueqiang1,WU Yunfang3   

  1. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (Beijing Information Science and Technology University), Beijing 100101, China
    2. School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China
    3. Institute of Computational Linguistics, Peking University, Beijing 100871, China
  • Received: 2013-09-02 Revised: 2013-11-22 Online: 2014-04-01 Published: 2014-04-29
  • Contact: GENG Lifei

摘要:

针对目前融合词义信息的短语句法分析过程中,多义词词义消歧较差的问题,提出一种基于词性消歧的中文短语句法分析方法。首先构建具有词性信息的同义词字典;然后对训练集和测试集中的词语进行词义替换,利用多义词的词性区分其不同的词义。在宾州中文树库(CTB)的实验结果表明,正确率为80.30%,召回率为78.12%,F值为79.19%。相对于没有进行词性消歧的系统,该方法有效提高了短语句法分析的性能。

Abstract:

To address the poor word sense disambiguation of polysemous words in phrase parsing that incorporates word sense information, a Chinese phrase parsing approach based on part-of-speech disambiguation was proposed. First, a synonym dictionary annotated with part-of-speech information was built by extending TongYiCi CiLin. Then, the words in the training set and test set were replaced with their semantic codes, using the part of speech of each polysemous word to distinguish its senses. Experimental results on the Penn Chinese Treebank (CTB) show that the proposed method achieves a precision of 80.30%, a recall of 78.12%, and an F-measure of 79.19%. Compared with a system without part-of-speech disambiguation, the proposed approach effectively improves the performance of phrase parsing.
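
The substitution step described in the abstract can be illustrated with a minimal sketch. It assumes that tokens arrive already tagged with CTB-style part-of-speech tags and that the dictionary is keyed by (word, tag) pairs. The lexicon entries, the code strings such as `SENSE_TO_LEAD`, and the function name `substitute_senses` are hypothetical and are not taken from the paper or from TongYiCi CiLin; this is not the authors' implementation.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hypothetical toy lexicon: (word, POS tag) -> semantic code.
# The codes are invented placeholders, not real TongYiCi CiLin codes.
SENSE_LEXICON: Dict[Token, str] = {
    ("领导", "NN"): "SENSE_LEADER",   # noun reading: "leader"
    ("领导", "VV"): "SENSE_TO_LEAD",  # verb reading: "to lead"
    ("团队", "NN"): "SENSE_GROUP",
}


def substitute_senses(tokens: List[Token], lexicon: Dict[Token, str]) -> List[Token]:
    """Replace each word with its semantic code, chosen by the word's POS tag.

    Words without a matching (word, tag) entry are left unchanged.
    """
    return [(lexicon.get((word, pos), word), pos) for word, pos in tokens]


sentence = [("他", "PN"), ("领导", "VV"), ("团队", "NN")]
replaced = substitute_senses(sentence, SENSE_LEXICON)
print(replaced)
```

Because the lookup key includes the tag, the polysemous word 领导 maps to a different code as a verb than as a noun, which is the sense distinction the abstract attributes to part of speech. The same function would be applied to both the training set and the test set so that the parser sees consistent symbols.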

CLC number: