Few-shot named entity recognition based on decomposed fuzzy span

doi:10.11772/j.issn.1001-9081.2024050567

Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (5): 1504-1510. DOI: 10.11772/j.issn.1001-9081.2024050567

• Artificial Intelligence •

Biqing ZENG¹, Guangbin ZHONG¹, James Zhiqing WEN²

¹ School of Software, South China Normal University, Foshan Guangdong 528225, China
² Intelligent Robot Engineering Research Center, JIHUA Laboratory, Foshan Guangdong 528200, China

Received: 2024-05-09 Revised: 2024-07-18 Accepted: 2024-07-19 Online: 2024-07-25 Published: 2025-05-10
Contact: Biqing ZENG
About author: ZENG Biqing, born in 1969 in Hengnan, Hunan, Ph. D., professor, CCF distinguished member. His research interests include natural language processing and artificial intelligence.
ZHONG Guangbin, born in 1998 in Meizhou, Guangdong, M. S. candidate. His research interests include natural language processing and few-shot named entity recognition.
WEN James Zhiqing, born in 1964 in Zhaoyuan, Shandong, Ph. D., professor. His research interests include artificial intelligence, machine vision, optoelectronic technology and robotics.
Supported by:
National Natural Science Foundation of China (62076103); Guangdong Basic and Applied Basic Research Foundation (2021A1515011171); Guangzhou Basic Research Program Basic and Applied Basic Research Project (202102080282); Foshan Key Science and Technology Research Project (2020001006807)

Abstract:

Few-shot Named Entity Recognition (few-shot NER) aims to identify entity spans and their types in text from a small amount of labeled data. Although span-based metric learning has achieved promising results in recent years, two problems remain: first, a small number of candidate spans may cause prototypes to deviate from the centers of their clusters; second, class-agnostic span detectors may produce some non-entity spans. To address these problems, a decomposed model integrating fuzzy spans, DFSM (Decomposed Fuzzy Span Model), was proposed for few-shot NER. In the span detection stage, DFSM used a global boundary matrix to detect candidate spans, so that explicit entity boundary information could be learned without being affected by token-level label dependencies. In the span classification stage, a fuzzy span strategy was proposed to adjust the boundary ranges of candidate spans, thereby increasing the number of trainable candidate spans for each entity type. Meanwhile, prototypical contrastive learning was designed to optimize the span-based semantic representation space. In addition, prototypical boundary learning was introduced to enlarge the distance between non-entity spans and prototypes, eliminating interference from noisy non-entity data. Experimental results on the Few-NERD and CrossNER datasets show that, compared with the baseline model TadNER, DFSM improves the average F1 score by 8.52 percentage points under the Few-NERD Inter setting, and by 10.39 percentage points in the Inter 10-way 5~10-shot setting, indicating a stronger ability to recognize fine-grained entity types; compared with the baseline model DecomMeta, DFSM improves the average F1 score by 3.32 and 1.09 percentage points in the CrossNER 1-shot and 5-shot settings, respectively, indicating good generalization ability in cross-domain low-resource scenarios.

Key words: named entity recognition (NER), few-shot learning, prototypical network, global boundary matrix, fuzzy span
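The abstract says only that the fuzzy span strategy adjusts the boundary ranges of candidate spans to obtain more trainable spans per entity type; it does not give the exact rule. The sketch below shows one plausible reading, assuming each boundary may shift by up to `delta` tokens. The helper `fuzzy_spans`, the `delta` parameter and the example sentence are all hypothetical and are not taken from the paper.

```python
def fuzzy_spans(start, end, seq_len, delta=1):
    """Enumerate spans whose boundaries lie within `delta` tokens of (start, end).

    Indices are inclusive token positions. This is an assumed reading of
    "adjusting the boundary range", not the authors' published rule.
    """
    variants = []
    for s in range(max(0, start - delta), min(seq_len - 1, start + delta) + 1):
        for e in range(max(0, end - delta), min(seq_len - 1, end + delta) + 1):
            # Keep only well-formed spans and skip the original span itself.
            if s <= e and (s, e) != (start, end):
                variants.append((s, e))
    return variants


tokens = ["Barack", "Obama", "visited", "Paris", "yesterday"]
gold = (0, 1)  # hypothetical annotated entity span "Barack Obama"
candidates = [gold] + fuzzy_spans(*gold, seq_len=len(tokens), delta=1)
for s, e in candidates:
    print((s, e), " ".join(tokens[s:e + 1]))
```

Under this assumption a single annotated span yields several boundary-shifted candidates, which is one way the number of spans contributing to each class prototype could grow in a low-shot episode.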

CLC Number:

TP18

Biqing ZENG, Guangbin ZHONG, James Zhiqing WEN. Few-shot named entity recognition based on decomposed fuzzy span[J]. Journal of Computer Applications, 2025, 45(5): 1504-1510.

Figures/Tables 6

Model	Intra 1~2-shot 5-way	Intra 1~2-shot 10-way	Intra 5~10-shot 5-way	Intra 5~10-shot 10-way	Inter 1~2-shot 5-way	Inter 1~2-shot 10-way	Inter 5~10-shot 5-way	Inter 5~10-shot 10-way
ProtoBERT	23.45	19.76	41.93	34.61	44.44	39.09	58.80	53.97
NNShot	31.01	21.88	35.74	27.67	54.29	46.98	50.56	50.00
StructShot	35.92	25.38	38.83	26.39	57.33	49.46	57.16	49.39
CONTaiNER	40.43	33.84	53.70	47.49	55.95	48.35	61.83	57.12
FNFP	33.14	24.44	54.80	48.24	51.87	46.14	63.86	60.91
ESD	41.44	32.29	50.68	42.92	66.46	59.95	74.14	67.91
DecomMeta	52.04	43.50	63.23	56.84	68.77	63.26	71.62	68.32
TadNER	60.78	55.44	67.94	60.87	64.83	64.06	72.12	69.94
DFSM	55.84	46.82	72.89	64.46	74.57	67.99	82.15	80.33
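The Few-NERD Inter gains quoted in the abstract can be reproduced from the TadNER and DFSM rows of this table. The short check below assumes the table values are F1 percentages and that the "average" is the unweighted mean over the four Inter columns; the variable names are illustrative only.

```python
# Inter columns in table order: 1~2-shot 5-way, 1~2-shot 10-way,
# 5~10-shot 5-way, 5~10-shot 10-way (values copied from the table).
inter_f1 = {
    "TadNER": [64.83, 64.06, 72.12, 69.94],
    "DFSM": [74.57, 67.99, 82.15, 80.33],
}


def mean(values):
    """Unweighted arithmetic mean (assumed averaging scheme)."""
    return sum(values) / len(values)


avg_gain = mean(inter_f1["DFSM"]) - mean(inter_f1["TadNER"])
gain_10way_5_10shot = inter_f1["DFSM"][3] - inter_f1["TadNER"][3]

print(f"average Inter gain: {avg_gain:.2f} percentage points")
print(f"Inter 10-way 5~10-shot gain: {gain_10way_5_10shot:.2f} percentage points")
```

Both figures agree with the abstract under these assumptions. The same table also shows TadNER ahead of DFSM in both Intra 1~2-shot columns (60.78 vs. 55.84 and 55.44 vs. 46.82), so the reported advantage is specific to the Inter setting and the 5~10-shot Intra columns.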

Model	1-shot News	1-shot Wiki	1-shot Social	1-shot Mixed	5-shot News	5-shot Wiki	5-shot Social	5-shot Mixed
TransferBERT	4.75	0.57	2.71	3.46	15.36	3.62	11.08	35.49
SimBERT	19.22	6.91	5.18	13.99	32.01	10.63	8.20	21.14
Matching Network	19.50	4.73	17.23	15.06	19.85	5.58	6.61	8.08
ProtoBERT	32.49	3.89	10.68	6.67	50.06	9.54	17.26	13.59
FNFP	24.50	8.61	10.89	26.90	51.58	16.10	20.62	29.91
L-TapNet+CDT	44.30	12.04	20.80	15.17	45.35	11.65	23.30	20.95
DecomMeta	46.09	17.54	25.14	34.13	58.18	31.36	31.02	45.55
DFSM	50.91	28.80	21.65	34.80	61.36	36.97	26.20	45.92
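A per-domain breakdown helps when reading the average cross-domain gains over DecomMeta quoted in the abstract. The sketch below assumes the values are F1 percentages and that the average is the unweighted mean over the four domains; the dictionary layout is illustrative only.

```python
domains = ["News", "Wiki", "Social", "Mixed"]

# Rows copied from the table above.
f1 = {
    "1-shot": {
        "DecomMeta": [46.09, 17.54, 25.14, 34.13],
        "DFSM": [50.91, 28.80, 21.65, 34.80],
    },
    "5-shot": {
        "DecomMeta": [58.18, 31.36, 31.02, 45.55],
        "DFSM": [61.36, 36.97, 26.20, 45.92],
    },
}

gains = {}
for setting, rows in f1.items():
    per_domain = [d - b for d, b in zip(rows["DFSM"], rows["DecomMeta"])]
    gains[setting] = {
        "per_domain": dict(zip(domains, per_domain)),
        "average": sum(per_domain) / len(per_domain),
    }

for setting, result in gains.items():
    details = ", ".join(f"{k} {v:+.2f}" for k, v in result["per_domain"].items())
    print(f"{setting}: {details}; average {result['average']:+.3f}")
```

Under these assumptions the averages come out at about 3.3 and 1.1 percentage points, in line with the abstract. The breakdown also shows that the gain is concentrated in News and Wiki, while DFSM scores below DecomMeta on Social in both settings.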

Module	Intra	Inter
DFSM	55.84	74.57
DFSM w/o FS	53.78	72.82
DFSM w/o PCL	51.12	70.99
DFSM w/o PML	52.67	71.58
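The contribution of each component is easier to see as a drop relative to the full model. The sketch below computes those drops from the ablation rows, assuming the values are F1 percentages. The abbreviations are kept as printed in the table; FS and PCL presumably stand for the fuzzy span strategy and prototypical contrastive learning named in the abstract, and PML presumably for the prototypical boundary learning, but the page does not spell them out.

```python
full = {"Intra": 55.84, "Inter": 74.57}

# Ablated variants copied from the table above.
ablated = {
    "w/o FS": {"Intra": 53.78, "Inter": 72.82},
    "w/o PCL": {"Intra": 51.12, "Inter": 70.99},
    "w/o PML": {"Intra": 52.67, "Inter": 71.58},
}

# F1 lost when a component is removed (positive = the component helps).
drops = {
    name: {setting: round(full[setting] - scores[setting], 2) for setting in full}
    for name, scores in ablated.items()
}

# Variant with the largest combined drop across both settings.
largest = max(drops, key=lambda name: drops[name]["Intra"] + drops[name]["Inter"])

for name, drop in drops.items():
    print(f"{name}: Intra -{drop['Intra']:.2f}, Inter -{drop['Inter']:.2f}")
print("largest combined drop:", largest)
```

Removing any one component lowers both scores, and on these numbers the PCL variant loses the most in both settings.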

Temperature coefficient τ	Intra F1/%	Inter F1/%
τ = 0.05	72.89	81.13
τ = 0.10	72.62	81.45
τ = 0.20	72.28	81.81
τ = 0.40	72.06	82.15
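To show what a temperature coefficient τ does in a contrastive objective, the sketch below implements a generic InfoNCE-style loss between one span embedding and a set of class prototypes. It is a stand-in under stated assumptions, not the paper's loss: the function name, the cosine similarity and the toy two-dimensional vectors are all hypothetical.

```python
import math


def prototype_contrastive_loss(span, prototypes, target, tau):
    """Generic temperature-scaled contrastive loss (assumed form, not the paper's).

    Pulls `span` toward prototypes[target] and pushes it away from the others;
    similarities are divided by `tau` before the softmax.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    logits = [cosine(span, p) / tau for p in prototypes]
    # Numerically stable log-sum-exp.
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_norm - logits[target]


span = [0.9, 0.1]                       # toy span embedding
prototypes = [[1.0, 0.0], [0.0, 1.0]]   # toy class prototypes
losses = {
    tau: prototype_contrastive_loss(span, prototypes, target=0, tau=tau)
    for tau in (0.05, 0.10, 0.20, 0.40)
}
for tau, loss in losses.items():
    print(f"tau = {tau:.2f}: loss = {loss:.6f}")
```

In this toy case a smaller τ sharpens the softmax, so a span already close to its prototype incurs almost no loss, while a larger τ keeps a softer penalty. The table suggests the best setting differs by split: Intra peaks at τ = 0.05 and Inter at τ = 0.40.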
