Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (1): 29-35. DOI: 10.11772/j.issn.1001-9081.2020060934

Special Issue: The 8th China Conference on Data Mining (CCDM 2020)

• China Conference on Data Mining 2020 (CCDM 2020) •

Joint extraction of entities and relations based on relation-adaptive decoding

DING Xiangguo, SANG Jitao   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2020-07-02 Revised: 2020-07-24 Online: 2021-01-10 Published: 2020-08-21
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61832002).

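  • Corresponding author: SANG Jitao
  • About the authors: DING Xiangguo (born 1995), male, from Weifang, Shandong, M.S. candidate; main research interests: information extraction and knowledge graphs. SANG Jitao (born 1985), male, from Yantai, Shandong, professor, Ph.D., CCF member; main research interests: multimedia computing and data mining.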

Abstract: Encoder-decoder models for joint extraction of entities and relations solve the error propagation problem of pipeline models. However, previous encoder-decoder models have two problems. First, entities and relations are generated together in the decoding stage; because they are different kinds of objects, mapping them into the same semantic space reduces extraction performance. Second, the interaction between different relations is not considered. To address these two problems, a relation-adaptive decoding model for joint extraction of entities and relations was proposed. The proposed model converts the joint extraction task into the task of generating the entity pairs that correspond to each relation. Firstly, building on the encoder-decoder framework, the relations were handled separately in a divide-and-conquer manner: for each relation, the corresponding entity pairs were output adaptively, so that the decoding stage focused on generating entities. Then, the parameters of a single model were shared across the relations, so that the correlations between relations could be exploited. In experiments on two versions of the public New York Times (NYT) dataset, the proposed model improved the F1 score by 2.5 and 2.2 percentage points, respectively, over the state-of-the-art model. The experimental results show that relation-adaptive decoding effectively improves the joint extraction of entities and relations.
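
The task reformulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `PairDecoder` signature, the relation labels, and the lookup-table `toy_decoder` are all hypothetical stand-ins for a trained neural decoder. The sketch only shows the control flow in which one shared decoder is queried once per relation and outputs entity pairs rather than relation labels, plus an exact-match F1 over the resulting triples.

```python
from typing import Callable, Iterable, List, Set, Tuple

EntityPair = Tuple[str, str]
Triple = Tuple[str, str, str]

# Hypothetical signature for a shared decoder: it receives the sentence and
# one relation label, and returns the entity pairs found for that relation.
PairDecoder = Callable[[str, str], List[EntityPair]]


def extract_triples(sentence: str,
                    relations: Iterable[str],
                    decode_pairs: PairDecoder) -> Set[Triple]:
    """Query the same decoder once per relation and collect triples."""
    triples: Set[Triple] = set()
    for relation in relations:
        # The relation is an input condition, not something the decoder
        # generates, so the decoder only has to produce entities.
        for head, tail in decode_pairs(sentence, relation):
            triples.add((head, relation, tail))
    return triples


def toy_decoder(sentence: str, relation: str) -> List[EntityPair]:
    """Illustrative stand-in for a trained model: a lookup keyed by relation."""
    table = {
        "capital_of": [("Paris", "France")],
        "located_in": [("Paris", "France")],
    }
    return [(h, t) for h, t in table.get(relation, [])
            if h in sentence and t in sentence]


def f1_score(predicted: Set[Triple], gold: Set[Triple]) -> float:
    """F1 over exact-match triples; 0.0 when nothing is correct."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


sentence = "Paris is the capital of France."
relations = ["capital_of", "located_in", "born_in"]
predicted = extract_triples(sentence, relations, toy_decoder)
gold = {("Paris", "capital_of", "France")}
score = f1_score(predicted, gold)
```

Because the same `decode_pairs` callable serves every relation, any parameters behind it are shared across relations by construction, which mirrors the sharing idea stated in the abstract. How the real model conditions its decoder on the relation is not specified on this page.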

Key words: relation-adaptive, sharing, encoder-decoder, entities and relations, joint extraction

CLC Number: