Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (10): 2975-2989. DOI: 10.11772/j.issn.1001-9081.2021081542

• Artificial Intelligence •

Survey of event extraction

MA Chunming1, LI Xiuhong1, LI Zhe2, WANG Huiru1, YANG Dan1

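  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046
  2. Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong 999077
  • Received: 2021-08-31 Revised: 2021-12-08 Accepted: 2021-12-09 Online: 2022-10-14 Published: 2022-10-10
  • Corresponding author: LI Xiuhong
  • About the authors: First contact: MA Chunming, born in 1997 in Mianyang, Sichuan, male, M.S. candidate. His research interests include natural language processing and event extraction.
    LI Xiuhong, born in 1977 in Weihai, Shandong, female, Ph.D., associate professor. Her research interests include natural language processing and image processing. Email: xjulxh@xju.edu.cn
    LI Zhe, born in 1992 in Tai'an, Shandong, male, Ph.D. candidate. His research interests include speaker recognition and multi-modal semantic analysis.
    WANG Huiru, born in 1996 in Ili, Xinjiang, female, M.S. candidate. Her research interests include natural language processing and image processing.
    YANG Dan, born in 1996 in Nanchong, Sichuan, female, M.S. candidate. Her research interests include natural language processing and image processing.
  • Supported by:
    Key Scientific Research Project of the National Language Commission (ZDI135-96)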

Survey of event extraction

Chunming MA1, Xiuhong LI1, Zhe LI2, Huiru WANG1, Dan YANG1   

  1. College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China
  2. Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong 999077, China
  • Received: 2021-08-31 Revised: 2021-12-08 Accepted: 2021-12-09 Online: 2022-10-14 Published: 2022-10-10
  • Contact: Xiuhong LI
  • About the authors: MA Chunming, born in 1997, M.S. candidate. His research interests include natural language processing and event extraction.
    LI Xiuhong, born in 1977, Ph.D., associate professor. Her research interests include natural language processing and image processing.
    LI Zhe, born in 1992, Ph.D. candidate. His research interests include speaker recognition and multi-modal semantic analysis.
    WANG Huiru, born in 1996, M.S. candidate. Her research interests include natural language processing and image processing.
    YANG Dan, born in 1996, M.S. candidate. Her research interests include natural language processing and image processing.
  • Supported by:
    Key Scientific Research Project of the National Language Commission (ZDI135-96)

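Abstract:

Event extraction is the task of extracting events that interest users from unstructured information and presenting them to users in a structured form. It is widely applied in information collection, information retrieval, document synthesis, and question answering. Viewed as a whole, event extraction algorithms fall into four categories: pattern matching-based algorithms, trigger word methods, ontology-based algorithms, and state-of-the-art joint model methods. Different evaluation methods and datasets can be used depending on the needs of the research, and different event representation methods also bear on event extraction research. By task type, meta-event extraction and subject event extraction are the two basic tasks of event extraction. Meta-event extraction is carried out in three ways, based on pattern matching, machine learning, and neural networks, whereas subject event extraction is carried out in two ways, based on event frameworks and on ontologies. Event extraction research has achieved excellent results on single languages such as Chinese and English, while cross-language event extraction still faces many problems. Finally, the related work on event extraction is summarized and future research directions are proposed, in the hope of providing a reference for subsequent research.

Key words: event extraction, event representation, meta-event extraction, subject event extraction, cross-language event extraction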

Abstract:

Event extraction is the task of extracting the events that a user is interested in from unstructured information and presenting them to the user in a structured form. It has a wide range of applications in information collection, information retrieval, document synthesis, and question answering. Overall, event extraction algorithms can be divided into four categories: pattern matching algorithms, trigger word methods, ontology-based algorithms, and cutting-edge joint model methods. Different evaluation methods and datasets can be used according to research needs, and different event representation methods are also related to event extraction research. Distinguished by task type, meta-event extraction and subject event extraction are the two basic tasks of event extraction. Meta-event extraction has three kinds of methods, based on pattern matching, machine learning, and neural networks respectively, while subject event extraction has two, based on event frameworks and on ontologies respectively. Event extraction research has achieved excellent results in single languages such as Chinese and English, but cross-language event extraction still faces many problems. Finally, the related work on event extraction is summarized and future research directions are discussed, in order to provide guidance for subsequent research.

Key words: event extraction, event representation, meta-event extraction, subject event extraction, cross-language event extraction
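
As a rough illustration of what "extracting an event from unstructured text into a structured form" can mean, the following minimal Python sketch combines a trigger word lexicon with a single hand-written pattern. It is not taken from the survey: the lexicon `TRIGGERS`, the pattern `PATTERN`, the `Event` record, and the function `extract_event` are all hypothetical names introduced here, and the sentence template is an assumption that only covers very simple English sentences.

```python
import re
from dataclasses import dataclass, field

# Hypothetical trigger lexicon (an assumption for illustration):
# maps a trigger word to the event type it signals.
TRIGGERS = {
    "acquired": "Acquisition",
    "bought": "Acquisition",
    "attacked": "Attack",
}

# Hypothetical pattern: "<subject> <trigger> <object> [in|on <when>]".
# Real pattern-matching systems use far richer templates than this one.
PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+"
    r"(?P<trigger>" + "|".join(map(re.escape, TRIGGERS)) + r")\s+"
    r"(?P<object>.+?)"
    r"(?:\s+(?:in|on)\s+(?P<when>.+?))?\.?$"
)


@dataclass
class Event:
    """Structured record for one extracted event."""
    event_type: str
    trigger: str
    arguments: dict = field(default_factory=dict)


def extract_event(sentence):
    """Return an Event if the sentence matches the pattern, else None."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    arguments = {
        "subject": match.group("subject"),
        "object": match.group("object"),
    }
    if match.group("when"):
        arguments["when"] = match.group("when")
    trigger = match.group("trigger")
    return Event(TRIGGERS[trigger], trigger, arguments)


if __name__ == "__main__":
    for text in [
        "Acme acquired Widgets Ltd in 2019.",
        "Rebels attacked the convoy.",
        "The weather was nice.",
    ]:
        print(text, "->", extract_event(text))
```

The trigger word decides the event type, and the pattern assigns the surrounding text to argument roles; sentences without a known trigger yield no event. Machine learning and neural approaches replace the hand-written lexicon and pattern with learned classifiers, but the structured output has the same general shape.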

CLC number: