Few-shot named entity recognition based on decomposed fuzzy span

doi:10.11772/j.issn.1001-9081.2024050567

Journal of Computer Applications, 2025, Vol. 45, Issue 5: 1504-1510. DOI: 10.11772/j.issn.1001-9081.2024050567

• Artificial intelligence •

Biqing ZENG¹, Guangbin ZHONG¹, James Zhiqing WEN²

1. School of Software, South China Normal University, Foshan Guangdong 528225, China
2. Intelligent Robot Engineering Research Center, JIHUA Laboratory, Foshan Guangdong 528200, China

Received: 2024-05-09 Revised: 2024-07-18 Accepted: 2024-07-19 Online: 2024-07-25 Published: 2025-05-10
Contact: Biqing ZENG
About author: ZENG Biqing, born in 1969, a native of Hengnan, Hunan, Ph. D., professor, CCF Distinguished Member. His research interests include natural language processing and artificial intelligence.
ZHONG Guangbin, born in 1998, a native of Meizhou, Guangdong, M. S. candidate. His research interests include natural language processing and few-shot named entity recognition.
WEN James Zhiqing, born in 1964, a native of Zhaoyuan, Shandong, Ph. D., professor. His research interests include artificial intelligence, machine vision, optoelectronic technology, and robotics.
Supported by:
National Natural Science Foundation of China (62076103); Guangdong Basic and Applied Basic Research Foundation (2021A1515011171); Guangzhou Basic Research Program Basic and Applied Basic Research Project (202102080282); Foshan Key Science and Technology Research Project (2020001006807)

Abstract

Few-shot Named Entity Recognition (few-shot NER) aims to identify entity spans and their types in text from a limited amount of labeled data. Although span-based metric learning has achieved promising results in recent years, two challenges remain: first, the small number of candidate spans may pull prototypes away from their cluster centers; second, class-agnostic span detectors may produce some non-entity spans. To address these issues, a decomposed model integrating fuzzy spans, named DFSM (Decomposed Fuzzy Span Model), was proposed for few-shot NER. In the span detection stage, a global boundary matrix was used to detect candidate spans, so that explicit entity boundary information was learned without depending on token-level labels. In the span classification stage, a fuzzy span strategy was proposed to adjust the boundary ranges of candidate spans, thereby increasing the number of trainable candidate spans for each entity type. Meanwhile, a prototypical contrastive learning method was designed to optimize the span-based semantic representation space. In addition, prototypical boundary learning was introduced to enlarge the distance between non-entity spans and prototypes, eliminating interference from noisy non-entity data. Experimental results on the Few-NERD and CrossNER datasets show that, compared with the baseline model TadNER, DFSM improves the average F1-score by 8.52 percentage points under the Few-NERD Inter setting, and by 10.39 percentage points in the Inter 10-way 5~10-shot scenario, indicating a stronger ability to recognize fine-grained entity types; compared with the baseline model DecomMeta, DFSM improves the average F1-score by 3.32 and 1.09 percentage points in the CrossNER 1-shot and 5-shot settings, respectively, demonstrating good generalization in cross-domain low-resource scenarios.

Key words: Named Entity Recognition (NER), few-shot learning, prototypical network, global boundary matrix, fuzzy span
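
The span detection and fuzzy span steps named in the abstract can be pictured with a minimal Python sketch. This page does not give the model's actual formulation, so everything here is an assumption made for illustration: the boundary matrix is decoded with a simple score threshold, "fuzzy" is modelled as shifting each boundary by at most one token, and the names `decode_spans` and `fuzzy_spans` are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def decode_spans(boundary: List[List[float]], threshold: float = 0.5) -> List[Span]:
    """Return every span (i, j) with i <= j whose score exceeds the threshold.

    Assumption: boundary[i][j] scores the span that starts at token i and
    ends at token j; the thresholding rule is illustrative only.
    """
    n = len(boundary)
    return [(i, j) for i in range(n) for j in range(i, n) if boundary[i][j] > threshold]


def fuzzy_spans(span: Span, seq_len: int, max_shift: int = 1) -> List[Span]:
    """Expand one span into nearby variants by shifting each boundary.

    Assumption: start and end each move by at most max_shift tokens; spans
    that leave the sentence or have start > end are dropped.
    """
    start, end = span
    out = []
    for ds in range(-max_shift, max_shift + 1):
        for de in range(-max_shift, max_shift + 1):
            s, e = start + ds, end + de
            if 0 <= s <= e < seq_len:
                out.append((s, e))
    return out


# Toy 4-token sentence: only the span covering tokens 1..2 scores highly.
scores = [
    [0.1, 0.2, 0.1, 0.0],
    [0.0, 0.3, 0.9, 0.1],
    [0.0, 0.0, 0.2, 0.1],
    [0.0, 0.0, 0.0, 0.1],
]
candidates = decode_spans(scores)
expanded = fuzzy_spans(candidates[0], seq_len=len(scores))
print(candidates)     # [(1, 2)]
print(len(expanded))  # 8
```

In this toy case one detected span yields eight training spans (the original plus seven neighbours), which is the kind of increase in trainable candidate spans per entity type that the abstract attributes to the fuzzy span strategy.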

CLC Number: TP18

Biqing ZENG, Guangbin ZHONG, James Zhiqing WEN. Few-shot named entity recognition based on decomposed fuzzy span[J]. Journal of Computer Applications, 2025, 45(5): 1504-1510.


Figures/Tables 6

Model	Intra				Inter
	1~2-shot		5~10-shot		1~2-shot		5~10-shot
	5-way	10-way	5-way	10-way	5-way	10-way	5-way	10-way
ProtoBERT	23.45	19.76	41.93	34.61	44.44	39.09	58.80	53.97
NNShot	31.01	21.88	35.74	27.67	54.29	46.98	50.56	50.00
StructShot	35.92	25.38	38.83	26.39	57.33	49.46	57.16	49.39
CONTaiNER	40.43	33.84	53.70	47.49	55.95	48.35	61.83	57.12
FNFP	33.14	24.44	54.80	48.24	51.87	46.14	63.86	60.91
ESD	41.44	32.29	50.68	42.92	66.46	59.95	74.14	67.91
DecomMeta	52.04	43.50	63.23	56.84	68.77	63.26	71.62	68.32
TadNER	60.78	55.44	67.94	60.87	64.83	64.06	72.12	69.94
DFSM	55.84	46.82	72.89	64.46	74.57	67.99	82.15	80.33
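
The gains quoted in the abstract can be checked against the Inter columns of the table above. The short Python snippet below is only an arithmetic check under the assumption that the reported average is the unweighted mean of the four Inter settings; the variable names are illustrative.

```python
# Inter columns, in table order: 5-way 1~2-shot, 10-way 1~2-shot,
# 5-way 5~10-shot, 10-way 5~10-shot.
inter_f1 = {
    "TadNER": [64.83, 64.06, 72.12, 69.94],
    "DFSM": [74.57, 67.99, 82.15, 80.33],
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


# Average gain over the four Inter settings, in percentage points.
avg_gain = mean(inter_f1["DFSM"]) - mean(inter_f1["TadNER"])

# Gain in the Inter 10-way 5~10-shot column (last entry of each row).
gain_10way_5_10shot = inter_f1["DFSM"][3] - inter_f1["TadNER"][3]

print(f"{avg_gain:.2f}")             # 8.52
print(f"{gain_10way_5_10shot:.2f}")  # 10.39
```

Both values agree with the 8.52 and 10.39 percentage points stated in the abstract.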

Model	1-shot				5-shot
	News	Wiki	Social	Mixed	News	Wiki	Social	Mixed
TransferBERT	4.75	0.57	2.71	3.46	15.36	3.62	11.08	35.49
SimBERT	19.22	6.91	5.18	13.99	32.01	10.63	8.20	21.14
Matching Network	19.50	4.73	17.23	15.06	19.85	5.58	6.61	8.08
ProtoBERT	32.49	3.89	10.68	6.67	50.06	9.54	17.26	13.59
FNFP	24.50	8.61	10.89	26.90	51.58	16.10	20.62	29.91
L-TapNet+CDT	44.30	12.04	20.80	15.17	45.35	11.65	23.30	20.95
DecomMeta	46.09	17.54	25.14	34.13	58.18	31.36	31.02	45.55
DFSM	50.91	28.80	21.65	34.80	61.36	36.97	26.20	45.92
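
The cross-domain figures in the abstract can be cross-checked against the table above in the same way. This Python snippet assumes the reported improvement is the unweighted mean over the four domains; it also prints per-domain differences, since the averages hide the fact that DFSM trails DecomMeta on the Social domain.

```python
domains = ["News", "Wiki", "Social", "Mixed"]
f1 = {
    "1-shot": {
        "DecomMeta": [46.09, 17.54, 25.14, 34.13],
        "DFSM": [50.91, 28.80, 21.65, 34.80],
    },
    "5-shot": {
        "DecomMeta": [58.18, 31.36, 31.02, 45.55],
        "DFSM": [61.36, 36.97, 26.20, 45.92],
    },
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


avg_gains = {}     # setting -> mean DFSM gain over DecomMeta
domain_gains = {}  # setting -> {domain: DFSM minus DecomMeta}
for setting, rows in f1.items():
    avg_gains[setting] = mean(rows["DFSM"]) - mean(rows["DecomMeta"])
    domain_gains[setting] = {
        name: ours - base
        for name, ours, base in zip(domains, rows["DFSM"], rows["DecomMeta"])
    }
    print(setting, f"average gain: {avg_gains[setting]:.3f}")
    for name, delta in domain_gains[setting].items():
        print(f"  {name}: {delta:+.2f}")
```

The averages come out at about 3.315 (1-shot) and 1.085 (5-shot), consistent with the reported 3.32 and 1.09 percentage points after rounding.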

Module	Intra	Inter
DFSM	55.84	74.57
DFSM w/o FS	53.78	72.82
DFSM w/o PCL	51.12	70.99
DFSM w/o PML	52.67	71.58

Temperature coefficient τ	F1/%
	Intra	Inter
τ = 0.05	72.89	81.13
τ = 0.10	72.62	81.45
τ = 0.20	72.28	81.81
τ = 0.40	72.06	82.15
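
To make the role of the temperature coefficient τ concrete, here is a minimal Python sketch of a temperature-scaled contrastive loss between one span embedding and a set of class prototypes. The exact loss used by DFSM is not given on this page; the InfoNCE-style form below (softmax over cosine similarities divided by τ) and the function names are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def proto_contrastive_loss(span_vec, prototypes, target, tau):
    """Negative log-probability of the target prototype.

    Assumed form: softmax over cosine similarities scaled by 1 / tau.
    The log-sum-exp is computed with the usual max-shift for stability.
    """
    logits = [cosine(span_vec, p) / tau for p in prototypes]
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[target]


# Toy example: the span embedding lies close to prototype 0.
span_vec = [1.0, 0.1]
prototypes = [[1.0, 0.0], [0.0, 1.0]]

loss_sharp = proto_contrastive_loss(span_vec, prototypes, target=0, tau=0.05)
loss_smooth = proto_contrastive_loss(span_vec, prototypes, target=0, tau=0.40)
print(loss_sharp < loss_smooth)  # True
```

A smaller τ sharpens the softmax, so a span that is already nearest its own prototype is penalized less. In the table above the best Intra score occurs at τ = 0.05 and the best Inter score at τ = 0.40.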
