Journal of Computer Applications

Nested named entity recognition combined with boundary generation by multi-objective learning

  

  • Received: 2024-07-10 Revised: 2024-10-08 Online: 2024-11-19 Published: 2024-11-19

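Xu Zhangjie1, Chen Yanping2, Hu Ying2, Huang Ruizhang1, Qin Yongbin1

  1. Guizhou University
  2. College of Computer Science and Technology, Guizhou University
  • Corresponding author: Xu Zhangjie
  • Supported by:
    Research on a nested semantic recognition mechanism based on position regression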

Abstract: Named entity recognition aims to identify entities of predefined types in unstructured text. Span-based methods enumerate all candidate spans and classify each one. However, because adjacent spans in a text share contextual semantics, the semantics at span boundaries become ambiguous, which makes it difficult for a model to capture the dependencies between spans. To address this boundary ambiguity, a multi-objective learning named entity recognition model with boundary generation is proposed. The model is trained jointly on the named entity recognition task and a boundary generation task. The boundary generation task serves as an auxiliary task that guides the network to focus on span boundary information, which sharpens boundary semantics and improves recognition performance. To validate the model, experiments were conducted on the ACE04, ACE05, and Genia datasets, where it achieved F1 scores of 87.83%, 86.90%, and 81.65%, respectively. The experimental results show that the proposed method improves the accuracy of named entity recognition.

Key words: named entity recognition, span classification, multi-objective learning, boundary generation, neural networks
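
As a rough illustration of the span-based setup described in the abstract, the sketch below enumerates candidate spans and combines two task losses by a weighted sum. The abstract does not state the model's actual loss function or span-length limit, so `enumerate_spans`, `joint_loss`, `max_len`, and `weight` are hypothetical names and values chosen only for illustration, and the weighted sum is just one common way to combine a main and an auxiliary objective.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end inclusive, of at most max_len tokens."""
    spans = []
    for start in range(len(tokens)):
        # The last allowed end index is bounded by max_len and by the sentence end.
        for end in range(start, min(start + max_len, len(tokens))):
            spans.append((start, end))
    return spans


def joint_loss(ner_loss: float, boundary_loss: float, weight: float = 0.5) -> float:
    """Hypothetical multi-objective loss: main task plus a weighted auxiliary task.

    The weighted sum is an assumption, not the formulation used in the paper.
    """
    return ner_loss + weight * boundary_loss


tokens = ["the", "University", "of", "Guizhou"]
spans = enumerate_spans(tokens, max_len=3)

# Spans (0, 1) and (1, 2) both contain token 1, so their representations
# draw on shared context; this overlap is the source of boundary ambiguity.
overlapping = set(range(0, 2)) & set(range(1, 3))
```

Nested entities fall out of the enumeration naturally, since a short span such as (3, 3) and a longer span such as (1, 3) are both candidates and can each receive their own label.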
