Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2229-2236. DOI: 10.11772/j.issn.1001-9081.2024070980
Zhangjie XU, Yanping CHEN, Ying HU, Ruizhang HUANG, Yongbin QIN
Received: 2024-07-10
Revised: 2024-10-08
Accepted: 2024-10-09
Online: 2025-07-10
Published: 2025-07-10
Contact: Yanping CHEN
About author: XU Zhangjie, born in 2000 in Guiyang, Guizhou, M. S. candidate, CCF student member. Her research interests include natural language processing and information extraction.
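Abstract:
Named Entity Recognition (NER) aims to identify entities of predefined types in unstructured text. Span-based NER methods enumerate all possible spans and classify each one. However, adjacent spans in a text share contextual semantics, which blurs the boundary semantic information between spans and makes it difficult for the model to capture the dependencies among them. To address this blurred boundary semantics, a multi-objective learning NER model combined with boundary generation is proposed. The model trains the NER task and a boundary generation task jointly in a multi-objective learning manner. The boundary generation task serves as an auxiliary task that guides the network to attend to span boundary information, which strengthens the boundary semantics of spans and in turn improves NER performance. On the ACE2004, ACE2005 and GENIA datasets, the proposed model achieves F1 scores of 87.83%, 86.90% and 81.65%, respectively, which verifies its effectiveness across datasets and its strong performance on the NER task.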
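The span enumeration step that span-based methods rely on can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `enumerate_spans`, the `max_len` cap and the example tokens are assumptions introduced here (the abstract only says that all possible spans are enumerated).

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_len: int) -> List[Tuple[int, int]]:
    """Return every (start, end) token-index pair, end inclusive,
    covering at most max_len tokens. max_len is a hypothetical cap."""
    spans = []
    for start in range(len(tokens)):
        # The end index is bounded by the sentence end and by max_len.
        last_end = min(start + max_len, len(tokens))
        for end in range(start, last_end):
            spans.append((start, end))
    return spans


# Hypothetical example sentence with a nested mention.
tokens = ["Bank", "of", "China", "branch", "manager"]
spans = enumerate_spans(tokens, max_len=4)

# Neighbouring candidates such as (0, 2) and (0, 3) share most of their
# tokens, which is the overlap the abstract describes as blurring boundaries.
for start, end in spans:
    print((start, end), " ".join(tokens[start:end + 1]))
```

Each candidate span would then be passed to a classifier; the shared tokens between neighbouring candidates are what motivates an explicit boundary signal.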
Zhangjie XU, Yanping CHEN, Ying HU, Ruizhang HUANG, Yongbin QIN. Nested named entity recognition combined with boundary generation by multi-objective learning[J]. Journal of Computer Applications, 2025, 45(7): 2229-2236.
Split | Dataset | Sentences | Avg. sentence length | Entities | Avg. entity length | Nested entities | Nested entity ratio/% |
---|---|---|---|---|---|---|---|
Training | ACE2004 | 6 200 | 23.50 | 22 204 | 2.63 | 10 149 | 45.71 |
Training | ACE2005 | 7 194 | 19.21 | 24 441 | 2.42 | 9 389 | 38.41 |
Training | GENIA | 15 023 | 25.27 | 45 144 | 1.95 | 7 997 | 17.71 |
Validation | ACE2004 | 745 | 23.02 | 2 514 | 2.67 | 1 092 | 46.69 |
Validation | ACE2005 | 969 | 18.93 | 3 200 | 2.26 | 1 112 | 34.75 |
Validation | GENIA | 1 669 | 26.01 | 5 365 | 1.97 | 1 067 | 19.88 |
Test | ACE2004 | 812 | 23.05 | 3 035 | 2.68 | 1 417 | 45.61 |
Test | ACE2005 | 1 047 | 17.20 | 2 993 | 2.40 | 1 118 | 37.35 |
Test | GENIA | 1 854 | 25.98 | 5 506 | 2.08 | 1 199 | 21.77 |
Tab. 1 Dataset statistics
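The nested-entity ratio column of Tab. 1 appears to be the nested-entity count divided by the total entity count. The sketch below assumes that definition (the page does not state it) and checks it only against the ACE2004 and GENIA training rows; `nested_ratio` is a name introduced here for illustration.

```python
def nested_ratio(nested_entities: int, entities: int) -> float:
    """Percentage of entities that are nested, rounded to two decimals.
    Assumed definition: 100 * nested / total."""
    return round(100.0 * nested_entities / entities, 2)


# Counts copied from the training rows of Tab. 1.
print(nested_ratio(10149, 22204))  # ACE2004 training split
print(nested_ratio(7997, 45144))   # GENIA training split
```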
Parameter | Value | Parameter | Value |
---|---|---|---|
Batch size | 8 | Dropout | 0.5 |
Training epochs | 50 | Balance factor λ | 1×10⁻³ |
Learning rate | 1×10⁻⁵ | Convolution dilation rates | {1,2,3,4} |
Tab. 2 Parameter settings
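The settings in Tab. 2 can be collected into a single configuration object, and the balance factor λ can be read as the weight of the auxiliary boundary generation loss. The page does not give the loss formula, so the additive form below is an assumption, and `TrainConfig` and `joint_loss` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameter values copied from Tab. 2."""
    batch_size: int = 8
    epochs: int = 50
    learning_rate: float = 1e-5
    dropout: float = 0.5
    balance_lambda: float = 1e-3
    dilation_rates: Tuple[int, ...] = (1, 2, 3, 4)


def joint_loss(ner_loss: float, boundary_loss: float, cfg: TrainConfig) -> float:
    # Assumed form: main NER loss plus the lambda-weighted auxiliary loss.
    return ner_loss + cfg.balance_lambda * boundary_loss


cfg = TrainConfig()
print(joint_loss(ner_loss=0.5, boundary_loss=2.0, cfg=cfg))
```

With λ this small, the auxiliary term nudges the shared representation without dominating the NER objective, which is one plausible reading of why a small value is chosen.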
Model | ACE2004 P | ACE2004 R | ACE2004 F1 | ACE2005 P | ACE2005 R | ACE2005 F1 | GENIA P | GENIA R | GENIA F1 |
---|---|---|---|---|---|---|---|---|---|
BoningKnife | 85.98 | 86.86 | 86.41 | 84.77 | 86.16 | 85.46 | - | - | - |
Biaffine | 87.30 | 86.00 | 86.70 | 85.20 | 85.60 | 85.40 | 81.80 | 79.30 | 80.50 |
Locate and Label | 87.44 | 87.38 | 87.41 | 86.09 | 87.27 | 86.67 | 80.19 | 80.89 | 80.54 |
Triaffine | 87.13 | 87.68 | 87.40 | 86.70 | 86.94 | 86.82 | 80.42 | 82.06 | 81.23 |
W2NER | 87.33 | 87.71 | 87.52 | 85.03 | 88.62 | 86.79 | 83.10 | 79.76 | 81.39 |
Local Future Boost | 86.96 | 86.36 | 86.66 | 84.94 | 86.73 | 85.83 | 82.35 | 80.33 | 81.33 |
Debiasing | 87.64 | 87.61 | 87.63 | 85.01 | 87.47 | 86.22 | 79.51 | 79.48 | 79.49 |
Biaffine and Triaffine | 87.91 | 87.41 | 87.66 | 85.80 | 87.95 | 86.86 | 83.02 | 78.88 | 80.90 |
Proposed model | 87.33 | 88.34 | 87.83 | 85.37 | 88.48 | 86.90 | 81.26 | 82.05 | 81.65 |
Tab. 3 Results of different models on the datasets (%)
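The F1 values in Tab. 3 follow from the precision and recall columns through the usual harmonic mean, F1 = 2PR / (P + R). A short check on the last row of the table, with `f1_score` as a helper name introduced here:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs copied from the last row of Tab. 3.
last_row = {
    "ACE2004": (87.33, 88.34),
    "ACE2005": (85.37, 88.48),
    "GENIA": (81.26, 82.05),
}
for dataset, (p, r) in last_row.items():
    print(dataset, round(f1_score(p, r), 2))
```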
λ | P | R | F1 |
---|---|---|---|
10⁻¹ | 80.09 | 80.00 | 80.05 |
10⁻² | 81.39 | 81.11 | 81.25 |
10⁻³ | 82.52 | 80.95 | 81.73 |
10⁻⁴ | 81.67 | 80.82 | 81.24 |
Tab. 4 Selection of balance factor λ on GENIA (%)
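Choosing λ from Tab. 4 amounts to taking the candidate with the highest F1. The sketch below assumes F1 is the selection criterion, which matches the value listed in Tab. 2 but is not stated explicitly on the page.

```python
# (P, R, F1) on GENIA for each candidate balance factor, copied from Tab. 4.
results = {
    1e-1: (80.09, 80.00, 80.05),
    1e-2: (81.39, 81.11, 81.25),
    1e-3: (82.52, 80.95, 81.73),
    1e-4: (81.67, 80.82, 81.24),
}

# Pick the candidate whose F1 (third field) is largest.
best_lambda = max(results, key=lambda lam: results[lam][2])
print(best_lambda, results[best_lambda][2])
```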
Model variant | ACE2004 F1 | ACE2005 F1 | GENIA F1 |
---|---|---|---|
Full model | 87.83 | 86.90 | 81.65 |
- Boundary generation | 87.67 | 86.56 | 81.48 |
- Dilated convolution | 87.65 | 86.74 | 81.57 |
- Biaffine | 87.58 | 86.73 | 81.51 |
Tab. 5 Performance after removing each module (%)
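The contribution of each module in Tab. 5 is easier to read as the F1 drop relative to the full model. The snippet below only does that subtraction on the reported numbers; the dictionary keys are labels chosen here for illustration.

```python
# F1 on (ACE2004, ACE2005, GENIA), copied from Tab. 5.
full_model = (87.83, 86.90, 81.65)
ablations = {
    "boundary generation": (87.67, 86.56, 81.48),
    "dilated convolution": (87.65, 86.74, 81.57),
    "Biaffine": (87.58, 86.73, 81.51),
}

# F1 drop per dataset when a module is removed, rounded to two decimals.
drops = {
    module: [round(full - ablated, 2) for full, ablated in zip(full_model, scores)]
    for module, scores in ablations.items()
}
for module, drop in drops.items():
    print(module, drop)
```

On these numbers, removing boundary generation gives the largest drop on ACE2005 and GENIA, while removing Biaffine gives the largest drop on ACE2004.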
[1] | WANG Y J, ZHANG C Y, BAI F B, et al. Review of Chinese named entity recognition research [J]. Journal of Frontiers of Computer Science and Technology, 2023, 17(2): 324-341. |
[2] | GUO J, XU G, CHENG X, et al. Named entity recognition in query [C]// Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2009: 267-274. |
[3] | PETKOVA D, CROFT W B. Proximity-based document representation for named entity retrieval [C]// Proceedings of the 16th ACM Conference on Information and Knowledge Management. New York: ACM, 2007: 731-740. |
[4] | MOLLÁ D, VAN ZAANEN M, SMITH D. Named entity recognition for question answering [C]// Proceedings of the Australasian Language Technology Association Workshop 2006. [S.l.]: Australasian Language Technology Association, 2006: 51-58. |
[5] | ETZIONI O, CAFARELLA M, DOWNEY D, et al. Unsupervised named-entity extraction from the Web: an experimental study [J]. Artificial Intelligence, 2005, 165(1): 91-134. |
[6] | ZHANG Z, HAN X, LIU Z, et al. ERNIE: enhanced language representation with informative entities [C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 1441-1451. |
[7] | CHENG P, ERK K. Attending to entities for better text understanding [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 7554-7561. |
[8] | BABYCH B, HARTLEY A. Improving machine translation quality with automatic named entity recognition [C]// Proceedings of the 7th International EAMT workshop on MT and other language technology tools, Improving MT through other language technology tools, Resource and tools for building MT at EACL. Stroudsburg: ACL, 2003: 1-8. |
[9] | CAI Y X, LUO D, GAN Y L, et al. Nested named entity recognition based on span boundary perception [J]. Journal of Software, 2024, 35(11): 5149-5162. |
[10] | GENG R S, CHEN Y P, TANG R X, et al. Named entity recognition based on span semantic enhancement [J]. Journal of Xi’an Jiaotong University, 2022, 56(7): 118-126. |
[11] | LU W, ROTH D. Joint mention extraction and classification with mention hypergraphs [C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 857-867. |
[12] | MUIS A O, LU W. Labeling gaps between words: recognizing overlapping mentions with mention separators [C]// Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2017: 2608-2618. |
[13] | KATIYAR A, CARDIE C. Nested named entity recognition revisited [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 861-871. |
[14] | YAN Y, CAI B, SONG S. Nested named entity recognition as building local hypergraphs [C]// Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2023: 13878-13886. |
[15] | STRAKOVÁ J, STRAKA M, HAJIC J. Neural architectures for nested NER through linearization [C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 5326-5331. |
[16] | YAN H, GUI T, DAI J, et al. A unified generative framework for various NER subtasks [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 5808-5822. |
[17] | XIA Y, ZHAO Y, WU W, et al. Debiasing generative named entity recognition by calibrating sequence likelihood [C]// Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Stroudsburg: ACL, 2023: 1137-1148. |
[18] | JU M, MIWA M, ANANIADOU S. A neural layered model for nested named entity recognition [C]// Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 1446-1459. |
[19] | WANG J, SHOU L, CHEN K, et al. Pyramid: a layered model for nested named entity recognition [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 5918-5928. |
[20] | ROJAS M, BRAVO-MARQUEZ F, DUNSTAN J. Simple yet powerful: an overlooked architecture for nested named entity recognition [C]// Proceedings of the 29th International Conference on Computational Linguistics. [S.l.]: International Committee on Computational Linguistics, 2022: 2108-2117. |
[21] | SOHRAB M G, MIWA M. Deep exhaustive model for nested named entity recognition [C]// Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2018: 2843-2849. |
[22] | ZHENG C, CAI Y, XU J, et al. A boundary-aware neural model for nested named entity recognition [C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2019: 357-366. |
[23] | TAN C, QIU W, CHEN M, et al. Boundary enhanced neural span classification for nested named entity recognition [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 9016-9023. |
[24] | JIANG H, WANG G, CHEN W, et al. BoningKnife: joint entity mention detection and typing for nested NER via prior boundary knowledge [EB/OL]. [2024-05-10]. . |
[25] | SHEN Y, MA X, TAN Z, et al. Locate and label: a two-stage identifier for nested named entity recognition [C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg: ACL, 2021: 2782-2794. |
[26] | YU J, BOHNET B, POESIO M. Named entity recognition as dependency parsing [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 6470-6476. |
[27] | LI J, FEI H, LIU J, et al. Unified named entity recognition as word-word relation classification [C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 10965-10973. |
[28] | YUAN Z, TAN C, HUANG S, et al. Fusing heterogeneous factors with triaffine mechanism for nested named entity recognition [C]// Findings of the Association for Computational Linguistics: ACL 2022. Stroudsburg: ACL, 2022: 3174-3186. |
[29] | DENG J, LIU J, MA X, et al. Local feature enhancement for nested entity recognition using a convolutional block attention module [J]. Applied Sciences, 2023, 13(16): No.9200. |
[30] | GUO Y, TANG T, SUN S, et al. Nested entity recognition fusing span relative position and region information [J]. Electronics, 2023, 12(11): No.2483. |
[31] | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [C]// Proceedings of the 2019 North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long and Short Papers). Stroudsburg: ACL, 2019: 4171-4186. |
[32] | LEWIS M, LIU Y, GOYAL N, et al. BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 7871-7880. |
[33] | DODDINGTON G, MITCHELL A, PRZYBOCKI M, RAMSHAW L, STRASSEL S, WEISCHEDEL R. The Automatic Content Extraction (ACE) program-tasks, data, and evaluation [C]// Proceedings of the 4th International Conference on Language Resources and Evaluation Conference. Paris: European Language Resources Association, 2004: 837-840. |
[34] | WALKER C, STRASSEL S, MEDERO J, MAEDA K. ACE 2005 multilingual training corpus [DS/OL]. [2024-05-15]. . |
[35] | KIM J D, OHTA T, TATEISI Y, et al. GENIA corpus — a semantically annotated corpus for bio-text mining [J]. Bioinformatics, 2003, 19(S1): i180-i182. |
[36] | LOSHCHILOV I, HUTTER F. Fixing weight decay regularization in Adam [EB/OL]. [2024-06-10]. . |