Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2229-2236. DOI: 10.11772/j.issn.1001-9081.2024070980

• Artificial Intelligence •

Nested named entity recognition combined with boundary generation by multi-objective learning

Zhangjie XU1,2,3, Yanping CHEN1,2,3, Ying HU1,2,3, Ruizhang HUANG1,2,3, Yongbin QIN1,2,3

  1. Engineering Research Center of Ministry of Education for Text Computing and Cognitive Intelligence (Guizhou University), Guiyang, Guizhou 550025, China
  2. State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
  3. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
  • Received: 2024-07-10 Revised: 2024-10-08 Accepted: 2024-10-09 Online: 2025-07-10 Published: 2025-07-10
  • Contact: Yanping CHEN
  • About author: XU Zhangjie, born in 2000 in Guiyang, Guizhou, M. S. candidate, CCF student member. Her research interests include natural language processing and information extraction.
    CHEN Yanping, born in 1980 in Changshun, Guizhou, Ph. D., professor, CCF member. His research interests include artificial intelligence and natural language processing. ypench@gmail.com
    HU Ying, born in 1996 in Chongqing, Ph. D. candidate. His research interests include natural language processing.
    HUANG Ruizhang, born in 1979 in Tianjin, Ph. D., professor, CCF member. Her research interests include data fusion analysis, text mining, network mining, knowledge discovery, and machine learning.
    QIN Yongbin, born in 1980 in Yantai, Shandong, Ph. D., professor, CCF senior member. His research interests include big data governance and application, multi-source data fusion, intelligent computing, machine learning, and algorithm design.
  • Supported by:
    Major Science and Technology Project of Guizhou Province (Qiankehe [2024]003); National Key Research and Development Program of China (2023YFC3304500); National Natural Science Foundation of China (62166007)

Abstract:

Named Entity Recognition (NER) aims to identify entities of predefined types in unstructured text. Span-based NER methods enumerate all possible spans and classify each one. However, adjacent spans in a text share contextual semantics, which blurs the semantic information at span boundaries and makes it difficult for a model to capture the dependencies among spans. To address this ambiguity in boundary semantics, a multi-objective learning NER model combined with boundary generation is proposed. The model is trained jointly on the NER task and a boundary generation task in a multi-objective learning manner. The boundary generation task serves as an auxiliary task that guides the network to attend to the boundary information of spans, which strengthens their boundary semantics and in turn improves NER performance. In tests on the ACE2004, ACE2005, and GENIA datasets, the proposed model achieves F1 scores of 87.83%, 86.90%, and 81.65%, respectively. These results validate the effectiveness of the model on different datasets and its strong performance on the NER task.

Key words: Named Entity Recognition (NER), span classification, multi-objective learning, boundary generation, neural network
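The span-based setting and the joint objective summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how boundaries are generated or how the two losses are combined, so the helper names (`enumerate_spans`, `boundary_targets`, `joint_loss`), the per-token start/end targets, and the weighted-sum objective are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def enumerate_spans(n_tokens: int, max_len: int) -> List[Span]:
    """List every span of at most max_len tokens as inclusive (start, end) pairs."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start, min(start + max_len, n_tokens))
    ]


def boundary_targets(n_tokens: int, entities: List[Span]) -> Tuple[List[int], List[int]]:
    """Build 0/1 targets marking tokens that start or end an entity.

    Assumption: the auxiliary boundary task is modelled here as per-token
    start/end prediction; the paper's boundary generation task may differ.
    """
    starts = [0] * n_tokens
    ends = [0] * n_tokens
    for start, end in entities:
        starts[start] = 1
        ends[end] = 1
    return starts, ends


def joint_loss(ner_loss: float, boundary_loss: float, weight: float = 0.5) -> float:
    """Combine the two objectives as a weighted sum (hypothetical weighting)."""
    return ner_loss + weight * boundary_loss


if __name__ == "__main__":
    tokens = ["the", "president", "of", "France"]
    # Nested gold entities: the whole phrase (0..3) and the inner mention "France" (3..3).
    gold = [(0, 3), (3, 3)]

    spans = enumerate_spans(len(tokens), max_len=4)
    print(len(spans), "candidate spans")
    for start, end in spans:
        label = "ENTITY" if (start, end) in gold else "O"
        print(" ".join(tokens[start:end + 1]), "->", label)

    print(boundary_targets(len(tokens), gold))
    print(joint_loss(ner_loss=1.0, boundary_loss=0.5, weight=0.5))
```

Because every candidate span is scored independently, nested entities such as the inner mention "France" are handled naturally. The cost is a candidate set that grows with sentence length and the maximum span length, and neighbouring candidates overlap heavily, which is the boundary ambiguity the auxiliary task is meant to reduce.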

CLC Number: