Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (6): 1706-1712. DOI: 10.11772/j.issn.1001-9081.2023060833
Special Issue: The 38th CCF National Conference of Computer Applications (CCF NCCA 2023)
Chinese named entity recognition model incorporating multi-granularity linguistic knowledge and hierarchical information
Youren YU, Yangsen ZHANG, Yuru JIANG, Gaijuan HUANG
Received: 2023-06-28
Revised: 2023-07-27
Accepted: 2023-08-08
Online: 2023-09-04
Published: 2024-06-10
Contact: Yangsen ZHANG
About author: YU Youren, born in 1998 in Shijiazhuang, Hebei, M.S. candidate. His research interests include natural language processing.
Youren YU, Yangsen ZHANG, Yuru JIANG, Gaijuan HUANG. Chinese named entity recognition model incorporating multi-granularity linguistic knowledge and hierarchical information[J]. Journal of Computer Applications, 2024, 44(6): 1706-1712.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023060833
| Dataset | Total characters | Entity classes | Type | Training set | Validation set | Test set |
| --- | --- | --- | --- | --- | --- | --- |
| Resume | 153 000 | 8 | Sentences | 3 800 | 460 | 480 |
| | | | Entities | 13 400 | 1 500 | 1 630 |
| | | | Characters | 124 100 | 13 900 | 15 100 |
| | 103 000 | 4 | Sentences | 1 350 | 270 | 270 |
| | | | Entities | 1 890 | 390 | 420 |
| | | | Characters | 73 800 | 14 500 | 14 800 |
| MSRA | 10 400 000 | 3 | Sentences | 46 360 | — | 4 300 |
| | | | Entities | 74 700 | — | 6 200 |
| | | | Characters | 2 169 900 | — | 172 600 |

Tab. 1 Description of datasets
| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Batch size | 64 | Weight decay | 10⁻³ |
| Learning rate 1 | 5×10⁻⁵ | Max length | 256 |
| Learning rate 2 | 10⁻³ | Optimizer | AdamW |
| Dropout rate | 0.1 | | |

Tab. 2 Experimental parameter settings
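The two learning rates in Tab. 2 suggest discriminative fine-tuning: a small rate for the pretrained encoder and a larger one for the randomly initialized task layers. A minimal sketch of how such parameter groups could be assembled, in the dictionary shape accepted by PyTorch's `AdamW`; the name-prefix split and all parameter names are assumptions for illustration, not the paper's code:

```python
# Sketch only: the mapping of "Learning rate 1" to the encoder and
# "Learning rate 2" to the task layers is an assumption based on
# common fine-tuning practice.
ENCODER_LR = 5e-5    # Learning rate 1 in Tab. 2
TASK_LR = 1e-3       # Learning rate 2 in Tab. 2
WEIGHT_DECAY = 1e-3  # Weight decay in Tab. 2

def make_param_groups(named_params):
    """Split parameters by (hypothetical) name prefix into encoder/task groups."""
    encoder, task = [], []
    for name, p in named_params:
        (encoder if name.startswith("encoder.") else task).append(p)
    return [
        {"params": encoder, "lr": ENCODER_LR, "weight_decay": WEIGHT_DECAY},
        {"params": task, "lr": TASK_LR, "weight_decay": WEIGHT_DECAY},
    ]

# Hypothetical named parameters standing in for model.named_parameters():
groups = make_param_groups([("encoder.layer.0.w", "p1"), ("pointer.w", "p2")])
```

The resulting groups would then be passed to `torch.optim.AdamW(groups)`, with batch size 64 and dropout 0.1 as in Tab. 2.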
| Dataset | Model | Precision | Recall | F1 |
| --- | --- | --- | --- | --- |
| Resume | SoftLexicon LSTM[14] | 95.30 | 95.77 | 95.53 |
| | Lattice LSTM[12] | 94.81 | 94.11 | 94.46 |
| | FLAT[26] | — | — | 95.45 |
| | BERT[18] | 94.20 | 95.80 | 95.00 |
| | PLTE[27] | — | — | 95.40 |
| | SLK-NER[28] | 95.20 | 96.40 | 95.80 |
| | NFLAT[29] | 95.63 | 95.22 | 95.58 |
| | MECT[30] | 96.40 | 95.39 | 95.89 |
| | ZEN 2.0[17] | 95.34 | 96.17 | 95.75 |
| | Proposed model | 96.79 | 96.86 | 96.83 |
| | SoftLexicon LSTM[14] | 59.68 | 62.22 | 61.42 |
| | Lattice LSTM[12] | 53.04 | 62.25 | 58.79 |
| | FLAT[26] | — | — | 60.32 |
| | BERT[18] | 61.20 | 63.90 | 62.50 |
| | PLTE[27] | — | — | 59.76 |
| | SLK-NER[28] | 61.80 | 66.30 | 64.00 |
| | NFLAT[29] | 59.10 | 63.16 | 61.94 |
| | MECT[30] | 61.91 | 62.51 | 63.30 |
| | Proposed model | 74.31 | 63.11 | 68.25 |
| MSRA | SoftLexicon LSTM[14] | 94.63 | 92.70 | 93.66 |
| | Lattice LSTM[12] | 93.57 | 92.79 | 93.18 |
| | FLAT[26] | — | — | 94.12 |
| | BERT[18] | 94.43 | 93.86 | 94.14 |
| | PLTE[27] | — | — | 93.26 |
| | NFLAT[29] | 94.92 | 94.19 | 94.55 |
| | MECT[30] | 94.55 | 94.09 | 94.32 |
| | Proposed model | 95.95 | 95.85 | 95.90 |

Tab. 3 Comparison of experimental results among various models
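The F1 values in Tab. 3 are the harmonic mean of precision and recall, so any row can be spot-checked from its first two columns; small discrepancies are expected because the published precision and recall are rounded to two decimals:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in %)."""
    return 2 * precision * recall / (precision + recall)

# Proposed model on Resume (Tab. 3): P = 96.79, R = 96.86, reported F1 = 96.83
f1 = f1_score(96.79, 96.86)
```

Here `f1` evaluates to about 96.82, matching the reported 96.83 to within rounding of the published precision and recall.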
| Model | Resume | | | | | | MSRA | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | Precision | Recall | F1 | Precision | Recall | F1 | Precision | Recall | F1 |
| Proposed model | 96.79 | 96.86 | 96.83 | 74.31 | 63.11 | 68.25 | 95.95 | 95.85 | 95.90 |
| -ERNIE-Gram | 95.76 | 96.50 | 96.13 | 73.29 | 61.61 | 66.95 | 95.78 | 95.18 | 95.48 |
| -ON-LSTM | 96.18 | 96.78 | 96.48 | 73.80 | 60.15 | 66.28 | 95.03 | 96.17 | 95.60 |
| -Efficient pointer | 95.92 | 95.45 | 95.69 | 72.50 | 62.50 | 67.13 | 95.79 | 95.41 | 95.60 |

Tab. 4 Results of ablation experiments
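The "-Efficient pointer" ablation removes the span-pointer decoding head. The decoding idea behind such a head can be sketched as follows: score every (start, end) pair per entity type and keep spans whose score clears a threshold. This is an illustrative reconstruction under assumed names and a zero threshold, not the paper's exact formulation:

```python
def decode_spans(scores, threshold=0.0):
    """Decode entities from a pointer score tensor.

    scores[t][i][j] is the score that characters i..j (inclusive)
    form an entity of type t; only spans with start <= end are valid.
    """
    entities = []
    for t, grid in enumerate(scores):
        for i, row in enumerate(grid):
            for j, s in enumerate(row):
                if j >= i and s > threshold:
                    entities.append((t, i, j))
    return entities

# Toy example: one entity type over a 3-character sentence.
scores = [[
    [0.9, -1.0, -2.0],   # spans starting at character 0
    [-3.0, -0.5, 1.4],   # spans starting at character 1
    [-4.0, -5.0, -0.2],  # spans starting at character 2
]]
spans = decode_spans(scores)  # → [(0, 0, 0), (0, 1, 2)]
```

Because every span is scored directly, nested and overlapping entities fall out of the same decoding loop, which a tag-sequence CRF head cannot represent.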
| Model | Precision | Recall | F1 |
| --- | --- | --- | --- |
| BERT-Span | 79.80 | 94.18 | 86.40 |
| BERT-LSTM-CRF | 93.31 | 94.21 | 93.76 |
| BERT-LSTM-Efficient pointer | 95.28 | 94.49 | 94.88 |
| Proposed model | 95.38 | 95.20 | 95.29 |

Tab. 5 Comparison of experimental results on MDNER
References

[1] CAO Z S, XU Q Q, LI Z P, et al. Dual quaternion based collaborative knowledge graph modeling for recommendation[J]. Chinese Journal of Computers, 2022, 45(10): 2221-2242.
[2] BRANDSEN A, VERBERNE S, LAMBERS K, et al. Can BERT dig it? Named entity recognition for information retrieval in the archaeology domain[J]. Journal on Computing and Cultural Heritage, 2022, 15(3): No. 51.
[3] XIA Q, ZHANG B, WANG R, et al. A unified span-based approach for opinion mining with syntactic constituents[C]// Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2021: 1795-1804.
[4] SONG C H, LAWRIE D J, FININ T, et al. Improving neural named entity recognition with gazetteers[EB/OL]. [2023-05-30].
[5] JIANG B, WU Z, KARIMI H R. A distributed dynamic event-triggered mechanism to HMM-based observer design for H∞ sliding mode control of Markov jump systems[J]. Automatica, 2022, 142: 110357.
[6] HEARST M A, DUMAIS S T, OSUNA E, et al. Support vector machines[J]. IEEE Intelligent Systems and their Applications, 1998, 13(4): 18-28.
[7] DONG C, ZHANG J, ZONG C, et al. Character-based LSTM-CRF with radical-level features for Chinese named entity recognition[C]// Proceedings of the 5th CCF Conference on Natural Language Processing and Chinese Computing, and 24th International Conference on Computer Processing of Oriental Languages. Cham: Springer, 2016: 239-250.
[8] SHEN Y, TAN S, SORDONI A, et al. Ordered neurons: integrating tree structures into recurrent neural networks[EB/OL]. [2023-05-30].
[9] HUANG Z, XU W, YU K. Bidirectional LSTM-CRF models for sequence tagging[EB/OL]. [2023-07-24].
[10] CHIU J P C, NICHOLS E. Named entity recognition with bidirectional LSTM-CNNs[J]. Transactions of the Association for Computational Linguistics, 2016, 4: 357-370.
[11] PETERS M E, AMMAR W, BHAGAVATULA C, et al. Semi-supervised sequence tagging with bidirectional language models[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2017: 1756-1765.
[12] ZHANG Y, YANG J. Chinese NER using lattice LSTM[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 1554-1564.
[13] WU F, LIU J, WU C, et al. Neural Chinese named entity recognition via CNN-LSTM-CRF and joint training with word segmentation[C]// Proceedings of the 2019 World Wide Web Conference. New York: ACM, 2019: 3342-3348.
[14] MA R, PENG M, ZHANG Q, et al. Simplify the usage of lexicon in Chinese NER[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 5951-5960.
[15] YANG S, TU K. Bottom-up constituency parsing and nested named entity recognition with pointer networks[C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2022: 2403-2416.
[16] ZHU E, LI J. Boundary smoothing for named entity recognition[C]// Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2022: 7096-7108.
[17] SONG Y, ZHANG T, WANG Y, et al. ZEN 2.0: continue training and adaption for n-gram enhanced text encoders[EB/OL]. [2023-07-24].
[18] DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2019: 4171-4186.
[19] WU Z, YING C, ZHAO F, et al. Grid tagging scheme for aspect-oriented fine-grained opinion extraction[C]// Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2020: 2576-2585.
[20] XIAO D, LI Y-K, ZHANG H, et al. ERNIE-Gram: pre-training with explicitly n-gram masked language modeling for natural language understanding[C]// Proceedings of the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2021: 1702-1715.
[21] SU J, LU Y, PAN S, et al. RoFormer: enhanced transformer with rotary position embedding[EB/OL]. [2023-07-24].
[22] PENG N, DREDZE M. Named entity recognition for Chinese social media with jointly trained embeddings[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2015: 548-554.
[23] LEVOW G-A. The third international Chinese language processing bakeoff: word segmentation and named entity recognition[C]// Proceedings of the 5th SIGHAN Workshop on Chinese Language Processing. Stroudsburg: ACL, 2006: 108-117.
[24] MIN K, MA C, ZHAO T, et al. BosonNLP: an ensemble approach for word segmentation and POS tagging[C]// Proceedings of the 4th CCF Conference on Natural Language Processing and Chinese Computing. Cham: Springer, 2015: 520-526.
[25] XU L, TONG Y, DONG Q, et al. CLUENER2020: fine-grained named entity recognition dataset and benchmark for Chinese[EB/OL]. [2023-07-24].
[26] LI X, YAN H, QIU X, et al. FLAT: Chinese NER using flat-lattice Transformer[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 6836-6842.
[27] MENGGE X, YU B, LIU T, et al. Porous lattice-based transformer encoder for Chinese NER[C]// Proceedings of the 28th International Conference on Computational Linguistics. [S.l.]: International Committee on Computational Linguistics, 2020: 3831-3841.
[28] HU D, WEI L. SLK-NER: exploiting second-order lexicon knowledge for Chinese NER[C/OL]// Proceedings of the 32nd International Conference on Software Engineering and Knowledge Engineering. (2020-07-16)[2023-05-30].
[29] WU S, SONG X, FENG Z, et al. NFLAT: non-flat-lattice transformer for Chinese named entity recognition[EB/OL]. [2023-07-24].
[30] WU S, SONG X, FENG Z. MECT: multi-metadata embedding based cross-transformer for Chinese named entity recognition[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2021: 1529-1539.