Journal of Computer Applications, 2021, Vol. 41, Issue (12): 3534-3539. DOI: 10.11772/j.issn.1001-9081.2021060928

• The 18th China Conference on Machine Learning (CCML 2021) •

Event detection without trigger words incorporating syntactic information

Cui WANG1,2, Yafei ZHANG1,2, Junjun GUO1,2, Shengxiang GAO1,2, Zhengtao YU1,2

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming Yunnan 650504, China
  2. Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming Yunnan 650504, China
  • Received: 2021-05-12 Revised: 2021-06-17 Accepted: 2021-07-04 Online: 2021-12-28 Published: 2021-12-10
  • Contact: Yafei ZHANG
  • About author: WANG Cui, born in 1997 in Zhaotong, Yunnan, M. S. candidate, CCF member. Her research interests include natural language processing, event detection.
    GUO Junjun, born in 1987 in Lüliang, Shanxi, Ph. D., associate professor. His research interests include natural language processing, machine translation.
    GAO Shengxiang, born in 1977 in Dali, Yunnan, Ph. D., associate professor. Her research interests include natural language processing, machine translation.
    YU Zhengtao, born in 1970 in Qujing, Yunnan, Ph. D., professor. His research interests include natural language processing, machine translation, information retrieval.
  • Supported by:
    the National Natural Science Foundation of China (61762056); the National Key Research and Development Program of China (2018YFC0830105); the Special Project of Yunnan High-Tech Industry (201606); the Yunnan Major Science and Technology Program (202002AD080001-5); the Yunnan Basic Research Program (202001AS070014)

Abstract:

Event Detection (ED) is one of the most important tasks in information extraction; it aims to identify instances of specific event types in text. Existing ED methods usually represent syntactic dependencies with an adjacency matrix, but the adjacency matrix typically has to be encoded by a Graph Convolutional Network (GCN) to obtain syntactic information, which increases model complexity. To address this, an event detection method without trigger words that incorporates syntactic information is proposed. The dependency parent word and its context are converted into position marker vectors, which are incorporated into the word embedding of the dependent child word at the source end of the model in a parameter-free manner; this strengthens the semantic representation of the context without requiring a GCN encoder. In addition, because annotating trigger words is time-consuming and laborious, a type perceptron based on the multi-head attention mechanism is designed to model the potential trigger words in a sentence, enabling event detection without trigger words. To verify the performance of the proposed method, experiments were conducted on the ACE2005 dataset and a low-resource Vietnamese dataset. On the ACE2005 dataset, the F1-score of the proposed method was 3.7% higher than that of the Event Detection Using Graph Transformer Network (GTN-ED) method; on the Vietnamese dataset, it was 9% higher than that of the binary classification method Type-aware Bias Neural Network with Attention Mechanisms (TBNNAM). The results show that integrating syntactic information into Transformer effectively connects the event information scattered across a sentence and thereby improves the accuracy of event detection.

Key words: Event Detection (ED), syntactic information, parameter-free, no trigger word, type perceptron
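
The two ideas summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the position marker vectors are built or fused, so the sketch assumes a fixed sinusoidal encoding of the parent's position that is simply added to the child's embedding (no learned weights), and it replaces the multi-head type perceptron with a single attention head whose query is an event-type vector. The function names `sinusoidal_position`, `fuse_parent_positions`, and `type_attention_score`, and all toy values, are hypothetical.

```python
import math


def sinusoidal_position(pos, dim):
    """Fixed (non-learned) sinusoidal encoding of an integer position."""
    vec = []
    for k in range(dim):
        angle = pos / (10000 ** (2 * (k // 2) / dim))
        vec.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
    return vec


def fuse_parent_positions(embeddings, heads):
    """Add the encoding of each token's dependency-parent position to its embedding.

    heads[i] is the index of the parent of token i (assumption: the root
    points to itself). No trainable parameters are involved.
    """
    dim = len(embeddings[0])
    fused = []
    for emb, head in zip(embeddings, heads):
        marker = sinusoidal_position(head, dim)
        fused.append([e + m for e, m in zip(emb, marker)])
    return fused


def type_attention_score(token_vectors, type_vector):
    """Single-head scaled dot-product attention with an event-type query.

    Returns the attention weights over tokens and a probability that the
    sentence contains the given event type (a binary decision per type).
    """
    dim = len(type_vector)
    logits = [
        sum(t * q for t, q in zip(tok, type_vector)) / math.sqrt(dim)
        for tok in token_vectors
    ]
    peak = max(logits)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [
        sum(w * tok[d] for w, tok in zip(weights, token_vectors))
        for d in range(dim)
    ]
    score = sum(p * q for p, q in zip(pooled, type_vector))
    return weights, 1.0 / (1.0 + math.exp(-score))


# Toy sentence of three tokens with 4-dimensional embeddings (made-up values).
embeddings = [[0.1, 0.2, 0.3, 0.4], [0.0, 0.5, 0.1, 0.2], [0.3, 0.1, 0.0, 0.6]]
heads = [1, 1, 1]  # toy parse: token 1 is the root, tokens 0 and 2 depend on it
fused = fuse_parent_positions(embeddings, heads)
weights, prob = type_attention_score(fused, [0.2, 0.1, 0.4, 0.3])
```

Because the fusion step is a plain vector addition, it adds no parameters to the model, which is the property the abstract contrasts with GCN encoding. The attention weights indicate which tokens the type query attends to most, which is how potential trigger words can be modeled without trigger annotations.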

CLC Number: