Journal of Computer Applications ›› 2012, Vol. 32 ›› Issue (08): 2238-2244. DOI: 10.3724/SP.J.1087.2012.02238

• Artificial Intelligence

Comparative analysis of impact of lexical semantic information on Chinese entity relation extraction

LIU Dan-dan, PENG Cheng, QIAN Long-hua, ZHOU Guo-dong

  1. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  • Received: 2012-02-21 Revised: 2012-04-10 Online: 2012-08-28 Published: 2012-08-01
  • Corresponding author: QIAN Long-hua
  • About the authors: LIU Dan-dan (born 1987), female, from Tengzhou, Shandong, master's student; main research interest: information extraction;
    PENG Cheng (born 1987), male, from Lu'an, Anhui, master's student; main research interest: information extraction;
    QIAN Long-hua (born 1966), male, from Suzhou, Jiangsu, associate professor, CCF member; main research interest: natural language processing;
    ZHOU Guo-dong (born 1967), male, from Liyang, Jiangsu, professor, doctoral supervisor, CCF senior member; main research interest: natural language processing.
  • Supported by:
    National Natural Science Foundation of China (60873150, 90920004); Natural Science Foundation of Jiangsu Province (BK2010219, 11KJA520003)

Abstract: This paper proposes a method for incorporating semantic information from TongYiCi CiLin and HowNet into tree kernel-based Chinese relation extraction. It compares and analyzes the impact of these two kinds of semantic information on Chinese entity relation extraction, and explores how each of them interacts with entity type information. The experimental results show that the method improves the performance of Chinese relation extraction to some extent. TongYiCi CiLin complements entity type information, so its semantic information significantly improves extraction performance for most relation types whether or not entity type information is included. HowNet, by contrast, conflicts with entity type information, so when entity types are known it improves extraction performance only for a few relation types.

Key words: Chinese entity relation extraction, tree kernel, TongYiCi CiLin, HowNet, semantic information
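To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It uses the standard convolution tree kernel recursion of Collins and Duffy, which counts the subtree fragments two trees share, and attaches a semantic code as an extra child of each entity node. The tree layout (`REL`, `E1`, `E2`, `W`, `SEM`), the helpers `entity` and `instance`, and the codes `S_PERSON` and `S_ORG` are all hypothetical placeholders; they are not real TongYiCi CiLin or HowNet entries, and the paper's actual tree structure is not described in this section.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def production(n: Node):
    # A node's production: its label plus the ordered labels of its children.
    return (n.label, tuple(c.label for c in n.children))


def all_nodes(t: Node) -> List[Node]:
    out = [t]
    for c in t.children:
        out.extend(all_nodes(c))
    return out


def delta(n1: Node, n2: Node, lam: float) -> float:
    # Weighted count of common subtree fragments rooted at n1 and n2.
    if not n1.children or not n2.children:
        return 0.0  # leaves root no fragments
    if production(n1) != production(n2):
        return 0.0
    result = lam
    for c1, c2 in zip(n1.children, n2.children):
        result *= 1.0 + delta(c1, c2, lam)
    return result


def tree_kernel(t1: Node, t2: Node, lam: float = 1.0) -> float:
    # Sum delta over every pair of nodes from the two trees.
    return sum(delta(a, b, lam) for a in all_nodes(t1) for b in all_nodes(t2))


def entity(tag: str, word: str, sem: Optional[str] = None) -> Node:
    # Hypothetical entity node: the head word, plus an optional semantic code.
    kids = [Node("W", [Node(word)])]
    if sem is not None:
        kids.append(Node("SEM", [Node(sem)]))
    return Node(tag, kids)


def instance(w1: str, w2: str,
             sem1: Optional[str] = None, sem2: Optional[str] = None) -> Node:
    # Hypothetical relation instance: two entity mentions under one root.
    return Node("REL", [entity("E1", w1, sem1), entity("E2", w2, sem2)])


# Two instances whose head words differ but whose (made-up) semantic codes agree.
plain_a = instance("professor", "university")
plain_b = instance("teacher", "school")
sem_a = instance("professor", "university", "S_PERSON", "S_ORG")
sem_b = instance("teacher", "school", "S_PERSON", "S_ORG")

k_plain = tree_kernel(plain_a, plain_b)
k_sem = tree_kernel(sem_a, sem_b)
print(k_plain, k_sem)
```

In this toy setting the two instances share no words, so without semantic nodes the kernel sees only the bare structure. Once both entities carry matching semantic codes, additional fragments match and the kernel value rises, which is the mechanism by which a thesaurus can help a tree kernel generalize across different words. Whether a given resource helps in practice depends on how its codes overlap with entity type information, which is the question the paper examines.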

CLC number: