Journal of Computer Applications ›› 2016, Vol. 36 ›› Issue (2): 408-413. DOI: 10.11772/j.issn.1001-9081.2016.02.0408

• The 3rd CCF Big Data Academic Conference (CCF BigData 2015)

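Recognition of Chinese news event correlation based on grey relational analysis

LIU Panpan1, HONG Xudong1, GUO Jianyi1,2, YU Zhengtao1,2, WEN Yonghua1,2, CHEN Wei1,2

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China;
  2. Intelligent Information Processing Key Laboratory, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
  • Received: 2015-08-29 Revised: 2015-09-17 Online: 2016-02-10 Published: 2016-02-03
  • Corresponding author: GUO Jianyi (郭剑毅, born 1964), female, from Yanshi, Henan; professor. Research interests: natural language processing, information extraction, machine learning.
  • About the authors: LIU Panpan (刘盼盼, born 1990), female, from Jinan, Shandong; master's student. Research interests: natural language processing, information extraction. HONG Xudong (洪旭东, born 1989), male, from Ma'anshan, Anhui; Ph.D. candidate. Research interests: natural language processing, information retrieval. YU Zhengtao (余正涛, born 1970), male, from Qujing, Yunnan; professor, Ph.D. Research interests: natural language processing, information retrieval, machine learning, data mining. WEN Yonghua (文永华, born 1979), male, from Dali, Yunnan; Ph.D. candidate. Research interests: natural language processing, information extraction. CHEN Wei (陈玮, born 1983), male, from Qujing, Yunnan; Ph.D. candidate. Research interests: information retrieval, machine learning, information extraction.
  • Supported by:
    National Natural Science Foundation of China (61262041, 61472168, 61562052); Key Project of the Natural Science Foundation of Yunnan Province (2013FA030).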

Abstract: To address the low accuracy of correlation recognition between Chinese news events, a recognition algorithm based on Grey Relational Analysis (GRA), a multi-factor analysis method, was proposed. Firstly, by analyzing the characteristics of Chinese news events, three factors that affect event correlation were identified: the co-occurrence of triggers, the nouns shared between events, and the similarity of the event sentences. Secondly, the three factors were quantified and the influence weight of each factor was calculated. Finally, GRA was used to combine the three factors into a grey relational analysis model between events, which realizes event correlation recognition. The experimental results show that the three factors are effective for event correlation recognition and that, compared with recognition algorithms that consider only a single factor, the proposed algorithm improves the accuracy of event correlation recognition.

Key words: event correlation recognition, Grey Relational Analysis (GRA), multi-factor analysis method, co-occurrence, shared noun, similarity
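
The combination step described in the abstract can be illustrated with a minimal sketch of a weighted grey relational grade. It assumes the standard Deng formulation with distinguishing coefficient rho = 0.5, an ideal reference vector of (1, 1, 1), and factor scores already normalized to [0, 1]. The function name, the scores, and the weights are hypothetical; the abstract does not give the paper's own quantification or weighting scheme.

```python
from typing import List, Sequence


def grey_relational_grades(
    reference: Sequence[float],
    candidates: Sequence[Sequence[float]],
    weights: Sequence[float],
    rho: float = 0.5,
) -> List[float]:
    """Weighted grey relational grade of each candidate against the reference.

    Standard Deng formulation (an assumption, not taken from the paper):
    coefficient = (d_min + rho * d_max) / (d + rho * d_max),
    where d is the absolute difference on one factor and d_min / d_max
    are taken over all candidates and all factors.
    """
    if not candidates:
        return []
    n = len(reference)
    if len(weights) != n or any(len(c) != n for c in candidates):
        raise ValueError("reference, weights and candidates must have equal length")

    # Absolute differences between the reference and each candidate, per factor.
    deltas = [[abs(r - x) for r, x in zip(reference, cand)] for cand in candidates]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)

    # Every candidate equals the reference: all coefficients are 1.
    if d_max == 0:
        return [float(sum(weights))] * len(candidates)

    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(w * c for w, c in zip(weights, coeffs)))
    return grades


# Hypothetical scores per event pair, in the order:
# (trigger co-occurrence, shared nouns, event-sentence similarity).
ideal = (1.0, 1.0, 1.0)
event_pairs = [(0.9, 0.8, 0.7), (0.2, 0.1, 0.3)]
factor_weights = (0.4, 0.3, 0.3)  # illustrative weights that sum to 1

for pair, grade in zip(event_pairs, grey_relational_grades(ideal, event_pairs, factor_weights)):
    print(pair, round(grade, 3))
```

An event pair whose grade exceeds a chosen threshold would then be treated as correlated; the threshold itself would need to be tuned on labelled data and is not specified here.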
