Medical named entity recognition model based on deep auto-encoding

doi:10.11772/j.issn.1001-9081.2021071317

Abstract

As networks for Medical Named Entity Recognition (MNER) grow deeper, deep learning-based recognition models face an imbalance between recognition accuracy and computing power requirements. To address this problem, CasSAttMNER (Cascade Self-Attention Medical Named Entity Recognition), a medical named entity recognition model based on deep auto-encoding, was proposed. Firstly, a strategy that balances the depth difference between encoding and decoding was adopted: the distilled Transformer language model RBT6 was used as the encoder, which reduces the encoding depth and lowers the computing power required for training and application. Then, a cascaded multi-task dual decoder was built from a Bidirectional Long Short-Term Memory (BiLSTM) network and a Conditional Random Field (CRF) to perform entity mention sequence labeling and entity class determination. Finally, based on the self-attention mechanism, the implicit decoding information extracted during entity mention labeling was added to the entity class decoding to optimize the model design. Experimental results show that the F values of CasSAttMNER on two Chinese medical entity datasets reach 0.9439 and 0.9457, which are 3 percentage points and 8 percentage points higher than those of the baseline model, respectively, verifying that the model further improves decoder performance.

Key words: named entity recognition, auto-encoding network, Bidirectional Long Short-Term Memory (BiLSTM) network, attention mechanism, multi-task
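
The cascade described in the abstract (mention labeling first, class determination second) can be pictured with a toy example. The sketch below is not the paper's implementation: the plain B/I/O tag set, the per-token class scores, the rule of summing scores over a span, and the function names are all assumptions made for illustration, and the learned components (RBT6, BiLSTM, CRF, self-attention) are replaced by hand-written inputs.

```python
from collections import defaultdict


def extract_mentions(tags):
    """Stage 1 output -> spans. Turn a B/I/O tag sequence into (start, end)
    spans with an exclusive end. An "I" with no preceding "B" is ignored."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == "O":
            if start is not None:
                spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def classify_mentions(spans, class_scores):
    """Stage 2 (hypothetical rule): give each span the class whose per-token
    scores sum to the largest value over the span."""
    results = []
    for start, end in spans:
        totals = defaultdict(float)
        for token_scores in class_scores[start:end]:
            for label, score in token_scores.items():
                totals[label] += score
        results.append((start, end, max(totals, key=totals.get)))
    return results


# Hand-written stand-ins for the two decoder outputs on a 5-token input.
tags = ["O", "B", "I", "O", "B"]
class_scores = [
    {"drug": 0.5, "disease": 0.5},
    {"drug": 0.2, "disease": 0.7},
    {"drug": 0.4, "disease": 0.5},
    {"drug": 0.5, "disease": 0.5},
    {"drug": 0.9, "disease": 0.1},
]

mentions = extract_mentions(tags)
entities = classify_mentions(mentions, class_scores)
print(entities)
```

Splitting the task this way lets the mention decoder work with a very small tag set, while the class decision is made once per mention rather than once per token.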

CLC Number:

TP389.1

Xudong HOU, Fei TENG, Yi ZHANG. Medical named entity recognition model based on deep auto-encoding[J]. Journal of Computer Applications, 2022, 42(9): 2686-2692.



References

1	National Health and Family Planning Commission of the People's Republic of China. Standard for basic data sets of electronic medical record: [S]. Beijing: China Standard Press, 2014. doi:10.3969/j.issn.1672-7185.2019.02.002
2	General Office of the National Health Commission. Notice on printing and distributing the management standards for the application of electronic medical records (for trial implementation)[EB/OL]. (2017-02-23)[2021-05-14]. doi:10.31901/24566764.2014/05.02.02
3	General Office of the National Health Commission. Notice on issuing the administrative measures (trial) and evaluation standards (trial) for the application level evaluation of the electronic medical record system[EB/OL]. (2018-12-09)[2021-05-14]. doi:10.37544/0720-5953-2018-09-12
4	BODENREIDER O. The Unified Medical Language System (UMLS): integrating biomedical terminology[J]. Nucleic Acids Research, 2004, 32(S1): D267-D270. doi:10.1093/nar/gkh061
5	PATRICK J, LI M. High accuracy information extraction of medication information from clinical notes: 2009 I2B2 medication extraction challenge[J]. Journal of the American Medical Informatics Association, 2010, 17(5): 524-527. doi:10.1136/jamia.2010.003939
6	UZUNER Ö, SOUTH B R, SHEN S Y, et al. 2010 I2B2/VA challenge on concepts, assertions, and relations in clinical text[J]. Journal of the American Medical Informatics Association, 2011, 18(5): 552-556. doi:10.1136/amiajnl-2011-000203
7	SUN W Y, RUMSHISKY A, UZUNER O. Evaluating temporal relations in clinical text: 2012 I2B2 challenge[J]. Journal of the American Medical Informatics Association, 2013, 20(5): 806-813. doi:10.1136/amiajnl-2013-001628
8	STUBBS A, KOTFILA C, UZUNER Ö. Automated systems for the de-identification of longitudinal clinical narratives: overview of 2014 I2B2/UTHealth shared task Track 1[J]. Journal of Biomedical Informatics, 2015, 58(S): S11-S19. doi:10.1016/j.jbi.2015.06.007
9	YANG J F, GUAN Y, HE B, et al. Corpus construction for named entities and entity relations on Chinese electronic medical records[J]. Journal of Software, 2016, 27(11): 2725-2746. doi:10.13328/j.cnki.jos.004880
10	CUI Y M, CHENG W X, LIU T, et al. Revisiting pre-trained models for Chinese natural language processing[C]// Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020. Stroudsburg, PA: Association for Computational Linguistics, 2020: 657-668. doi:10.18653/v1/2020.findings-emnlp.58
11	COLLINS M, SINGER Y. Unsupervised models for named entity classification[C]// Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora. Stroudsburg, PA: Association for Computational Linguistics, 1999: 100-110.
12	TROTT P A. International classification of diseases for oncology[J]. Journal of Clinical Pathology, 1977, 30(8): 782-782. doi:10.1136/jcp.30.8.782-c
13	CORNET R, DE KEIZER N. Forty years of SNOMED: a literature review[J]. BMC Medical Informatics and Decision Making, 2008, 8(S1): No.S2. doi:10.1186/1472-6947-8-s1-s2
14	FRIEDMAN C, ALDERSON P O, AUSTIN J H M, et al. A general natural-language text processor for clinical radiology[J]. Journal of the American Medical Informatics Association, 1994, 1(2): 161-174. doi:10.1136/jamia.1994.95236146
15	CODEN A, SAVOVA G, SOMINSKY I, et al. Automatically extracting cancer disease characteristics from pathology reports into a Disease Knowledge Representation Model[J]. Journal of Biomedical Informatics, 2009, 42(5): 937-949. doi:10.1016/j.jbi.2008.12.005
16	SAVOVA G K, MASANZ J J, OGREN P V, et al. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications[J]. Journal of the American Medical Informatics Association, 2010, 17(5): 507-513. doi:10.1136/jamia.2009.001560
17	LI D C, KIPPER-SCHULER K, SAVOVA G. Conditional random fields and support vector machines for disorder named entity recognition in clinical texts[C]// Proceedings of the 2008 Workshop on Current Trends in Biomedical Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2008: 94-95. doi:10.3115/1572306.1572326
18	CLARK C, ABERDEEN J, COARR M, et al. MITRE system for clinical assertion status classification[J]. Journal of the American Medical Informatics Association, 2011, 18(5): 563-567. doi:10.1136/amiajnl-2011-000164
19	JONNALAGADDA S, COHEN T, WU S, et al. Enhancing clinical concept extraction with distributional semantics[J]. Journal of Biomedical Informatics, 2012, 45(1): 129-140. doi:10.1016/j.jbi.2011.10.007
20	WU Y H, JIANG M, LEI J B, et al. Named entity recognition in Chinese clinical text using deep neural network[J]. Studies in Health Technology and Informatics, 2015, 216: 624-628. doi:10.1136/amiajnl-2013-002381
21	HUANG Z H, XU W, YU K. Bidirectional LSTM-CRF models for sequence tagging[EB/OL]. (2015-08-09)[2021-05-14].
22	XU K, ZHOU Z F, HAO T Y, et al. A bidirectional LSTM and conditional random fields approach to medical named entity recognition[C]// Proceedings of the 2017 International Conference on Advanced Intelligent Systems and Informatics, AISC 639. Cham: Springer, 2017: 355-365.
23	JI B, LIU R, LI S S, et al. A hybrid approach for named entity recognition in Chinese electronic medical record[J]. BMC Medical Informatics and Decision Making, 2019, 19(S2): No.64. doi:10.1186/s12911-019-0767-2
24	BAEVSKI A, EDUNOV S, LIU Y H, et al. Cloze-driven pretraining of self-attention networks[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2019: 5360-5369. doi:10.18653/v1/d19-1539
25	LIU Y J, MENG F D, ZHANG J C, et al. GCDT: a global context enhanced deep transition architecture for sequence labeling[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2019: 2431-2441. doi:10.18653/v1/p19-1233
26	LI J, YE D H, SHANG S. Adversarial transfer for named entity boundary detection with pointer networks[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2019: 5053-5059. doi:10.24963/ijcai.2019/702
27	BALDI P, SADOWSKI P. The dropout learning algorithm[J]. Artificial Intelligence, 2014, 210: 78-122. doi:10.1016/j.artint.2014.02.004
28	SUTTON C, McCALLUM A. An introduction to conditional random fields for relational learning[M]// GETOOR L, TASKAR B. Introduction to Statistical Relational Learning. Cambridge: MIT Press, 2007: 93-127. doi:10.7551/mitpress/7432.003.0006
29	PARIKH A, TÄCKSTRÖM O, DAS D, et al. A decomposable attention model for natural language inference[C]// Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2016: 2249-2255. doi:10.18653/v1/d16-1244
30	Yidu Cloud. Yidu-S4K: Yidu Cloud structured 4K data set[DS/OL]. (2020-11-09)[2021-05-14].
31	2020 China Conference on Knowledge Graph and Semantic Computing. CCKS evaluation task CFP[EB/OL]. [2021-05-14]. doi:10.1155/2021/8884282
32	KINGMA D P, BA J L. Adam: a method for stochastic optimization[EB/OL]. (2017-01-30)[2021-05-22].
33	QIAO R, YANG X R, HUANG W K. Medical named entity recognition based on BERT and model fusion[EB/OL]. [2021-05-14]. doi:10.1145/3490322.3490336
34	LI N, LUO L, DING Z Y, et al. DUTIR at the CCKS-2019 task1: improving Chinese clinical named entity recognition using stroke ELMo and transfer learning[EB/OL]. [2021-05-14].
35	YAN Y T, ZHAO X Y, WU X. Medical named entity recognition based on BERT and character pattern and phonetic features[EB/OL]. [2021-05-14].
36	YANG W M, BI J L, ZOU J L, et al. Medical named entity recognition based on ChiEHRBert and multi-model fusion[EB/OL]. [2021-05-14].
37	ZHENG H Y, WEN R, CHEN X, et al. Medical named entity recognition using CRF-MT-Adapt and NER-MRC[EB/OL]. [2021-05-14]. doi:10.1109/cds52072.2021.00068

Model	CCKS-19 (F value)	CCKS-20 (F value)
Model of Ref. [33]	0.8562	-
Model of Ref. [34]	0.8516	-
Model fusion + rules [35]	-	0.9154
ChiEHRBert + entity fusion [36]	-	0.9124
Ensemble [37]	-	0.9051
CasSAttMNER	0.9439	0.9457
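
The gap between CasSAttMNER and the strongest compared model on each dataset can be read off this table with a little arithmetic. The sketch below copies the scores from the table; the dictionary layout and the helper name `margin_pp` are illustrative assumptions. The result is the margin over the models listed here, which is not necessarily the same reference point as the "baseline model" mentioned in the abstract.

```python
# F values copied from the comparison table; a missing key means that no
# score is listed for that model on that dataset.
F_VALUES = {
    "Ref. [33]": {"CCKS-19": 0.8562},
    "Ref. [34]": {"CCKS-19": 0.8516},
    "Model fusion + rules [35]": {"CCKS-20": 0.9154},
    "ChiEHRBert + entity fusion [36]": {"CCKS-20": 0.9124},
    "Ensemble [37]": {"CCKS-20": 0.9051},
    "CasSAttMNER": {"CCKS-19": 0.9439, "CCKS-20": 0.9457},
}


def margin_pp(f_values, ours, dataset):
    """Difference, in percentage points, between `ours` and the best other
    model that has a score on `dataset`."""
    best_other = max(
        scores[dataset]
        for name, scores in f_values.items()
        if name != ours and dataset in scores
    )
    return round((f_values[ours][dataset] - best_other) * 100, 2)


for dataset in ("CCKS-19", "CCKS-20"):
    print(dataset, margin_pp(F_VALUES, "CasSAttMNER", dataset))
```

With the values listed, the margins are 8.77 percentage points on CCKS-19 and 3.03 percentage points on CCKS-20.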

F value by entity category
Dataset	Model	Disease and diagnosis	Laboratory test	Operation	Drug	Anatomical site	Imaging examination
CCKS-19	Model of Ref. [33]	0.8429	0.7694	0.8333	0.9602	0.8618	0.8629
CCKS-19	Model of Ref. [34]	0.8281	0.7565	0.8679	0.9449	0.8599	0.8801
CCKS-19	CasSAttMNER	0.9429	0.9306	0.9091	0.9129	0.9549	0.9741
CCKS-20	Model fusion + rules [35]	0.9053	0.8350	0.9621	0.9375	0.9200	0.8847
CCKS-20	ChiEHRBert + entity fusion [36]	0.9110	0.8571	0.9552	0.9293	0.9116	0.8862
CCKS-20	Ensemble [37]	0.8992	0.8503	0.9375	0.9310	0.9043	0.8769
CCKS-20	CasSAttMNER	0.9262	0.9542	0.9322	0.9401	0.9565	0.9600
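
Per-category scores like these are entity-level F values. A minimal sketch of how such a score is commonly computed is given below, assuming strict matching (a predicted entity counts only if its start, end, and class all equal a gold entity); the matching rule, the tuple layout, and the function names are assumptions, since the page does not state the exact evaluation protocol.

```python
def f_value(gold, pred):
    """Entity-level F value, 2PR / (P + R), under strict matching."""
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f_value_by_class(gold, pred):
    """F value computed separately for each entity class."""
    classes = {e[2] for e in gold} | {e[2] for e in pred}
    return {
        c: f_value(
            [e for e in gold if e[2] == c],
            [e for e in pred if e[2] == c],
        )
        for c in sorted(classes)
    }


# Toy entities as (start, end, class); the last prediction has a wrong end.
gold = [(0, 2, "drug"), (5, 8, "disease"), (10, 12, "disease")]
pred = [(0, 2, "drug"), (5, 8, "disease"), (10, 13, "disease")]

print(f_value(gold, pred))
print(f_value_by_class(gold, pred))
```

In the toy data a boundary error on one disease mention costs both a false positive and a false negative, so the disease F value drops to 0.5 while the drug F value stays at 1.0.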
