Journal of Computer Applications, 2023, Vol. 43, Issue 5: 1438-1444. DOI: 10.11772/j.issn.1001-9081.2022040625

• Artificial intelligence

Joint entity and relation extraction based on contextual semantic enhancement

Jingsheng LEI1, Kaijun LA1, Shengying YANG1, Yi WU2

  1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou Zhejiang 310023, China
  2. Zhejiang Cancer Hospital, Hangzhou Zhejiang 310022, China
  • Received: 2022-05-07 Revised: 2022-07-28 Accepted: 2022-08-02 Online: 2022-09-29 Published: 2023-05-10
  • Contact: Shengying YANG
  • About author: LEI Jingsheng, born in 1966, Ph. D., professor. His research interests include data science and big data, machine learning, and artificial intelligence.
    LA Kaijun, born in 1996, M. S. candidate. His research interests include natural language processing and relation extraction.
    YANG Shengying, born in 1989, Ph. D., lecturer. His research interests include machine learning and artificial intelligence.
    WU Yi, born in 1988, M. S. Her research interests include artificial intelligence and nursing dialogue.
  • Supported by:
    National Natural Science Foundation of China (61972357); Key Research and Development Program of Zhejiang Province of China (2019C03135); Medical and Health Science and Technology Program of Zhejiang Province (2022KY104)

基于上下文语义增强的实体关系联合抽取

雷景生1, 剌凯俊1, 杨胜英1, 吴怡2

  1. 浙江科技学院 信息与电子工程学院,杭州 310023
  2. 浙江省肿瘤医院,杭州 310022
  • 通讯作者: 杨胜英
  • 作者简介:雷景生(1966—),男,陕西韩城人,教授,博士,主要研究方向:数据科学与大数据、机器学习、人工智能
    剌凯俊(1996—),男,山西介休人,硕士研究生,主要研究方向:自然语言处理、关系抽取
    杨胜英(1989—),男,山东东营人,讲师,博士,主要研究方向:机器学习、人工智能,E-mail:syyang@zust.edu.cn
    吴怡(1988—),女,浙江杭州人,硕士,主要研究方向:人工智能、护理对话。
  • 基金资助:
    国家自然科学基金资助项目(61972357);浙江省重点研发计划项目(2019C03135);浙江省医药卫生科技计划项目(2022KY104)

Abstract:

Span-based joint extraction models share the semantic representation of entity spans between the entity recognition and Relation Extraction (RE) tasks, which effectively reduces the cascading errors caused by pipeline models. However, existing models cannot adequately integrate contextual information into the representations of entities and relations. To solve this problem, a Joint Entity and Relation extraction model based on Contextual semantic Enhancement (JERCE) was proposed. Firstly, semantic feature representations of the sentence-level text and the inter-entity text were obtained by a contrastive learning method. Then, these representations were added to the entity and relation representations to predict entities and relations jointly. Finally, the losses of the two tasks were adjusted dynamically to optimize the overall performance of the joint model. In experiments on the public datasets CoNLL04, ADE and ACE05, compared with the Trigger-sense Memory Flow framework (TriMF), the proposed JERCE model improved the F1 scores of entity recognition by 1.04, 0.13 and 2.12 percentage points respectively, and the F1 scores of RE by 1.19, 1.14 and 0.44 percentage points respectively. Experimental results show that the JERCE model can adequately capture the semantic information in context.

Key words: Named Entity Recognition (NER), Relation Extraction (RE), contrastive learning, text span, weighted loss

摘要:

基于span的联合抽取模型在实体和关系抽取(RE)任务中共享实体span的语义表示,能有效降低流水线模型带来的级联误差,但现有模型无法充分地将上下文信息融入实体和关系的表示中。针对上述问题,提出一个基于上下文语义增强的实体关系联合抽取(JERCE)模型。首先通过对比学习的方法获取句子级文本和实体间文本的语义特征表示;然后,将该表示加入实体和关系的表示中,对实体关系进行联合预测;最后,动态调整两个任务的损失以使联合模型的整体性能最优化。在公共数据集CoNLL04、ADE和ACE05上进行实验,结果显示JERCE模型与触发器感知记忆流框架(TriMF)相比,实体识别F1值分别提升了1.04、0.13和2.12个百分点,RE的F1值则分别提升了1.19、1.14和0.44个百分点。实验结果表明,JERCE模型可以充分获取上下文中的语义信息。

关键词: 命名实体识别, 关系抽取, 对比学习, 文本span, 加权损失

CLC Number: