Alarm text named entity recognition based on BERT

doi:10.11772/j.issn.1001-9081.2019101717

Journal of Computer Applications, 2020, Vol. 40, Issue 2: 535-540. DOI: 10.11772/j.issn.1001-9081.2019101717

CCF Bigdata 2019

Yue WANG¹,², Mengxuan WANG¹,², Sheng ZHANG¹,², Wen DU¹,²

1. DS Information Technology Company Limited, Shanghai 200032, China
2. The First Research Institute of Telecommunications Technology, Shanghai 200032, China

Received: 2019-08-20; Revised: 2019-10-21; Accepted: 2019-10-24; Online: 2019-10-31; Published: 2020-02-10
Contact: Wen DU
About the authors: WANG Yue, born in 1994, a native of Lianyungang, Jiangsu, M. S. candidate, CCF member. His research interests include machine learning and natural language processing.
WANG Mengxuan, born in 1993, a native of Yinchuan, Ningxia, M. S. candidate, CCF member. His research interests include text classification and sentiment analysis.
ZHANG Sheng, born in 1994, a native of Wuhan, Hubei, M. S. candidate, CCF member. His research interests include deep learning and public opinion analysis.
Supported by:
the Shanghai Informatization Development (Big Data Development) Special Fund Project (201901043); the Shanghai Industrial Transformation and Upgrading Special Fund (Industrial Technology Innovation) Project (JJ-YJCX-01-18-3418)

Abstract

To address the difficulty of recognizing key entity information in the police domain, a neural network model based on BERT (Bidirectional Encoder Representations from Transformers), namely BERT-BiLSTM-Attention-CRF, was proposed to recognize and extract the relevant named entities, and entity annotation specifications were designed for the different case types. In the model, BERT pre-trained word vectors replace the static word vectors trained by traditional methods such as Skip-gram and Continuous Bag of Words (CBOW), which improves the representation ability of the word vectors and resolves the word-boundary division problem that arises when a Chinese corpus is trained with character vectors. The attention mechanism was also used to improve the architecture of the classical Named Entity Recognition (NER) model BiLSTM-CRF. The BERT-BiLSTM-Attention-CRF model achieves an accuracy of 91% on the test set, which is 7% higher than that of the CRF++ baseline and 4% higher than that of the BiLSTM-CRF model. The F1 values of entities such as related person names, loss amounts, and handling methods are all higher than 0.87.

Key words: alarm text, Named Entity Recognition (NER), pre-trained language model, annotation specification, word vector
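
Character-based Chinese NER of the kind the abstract describes is usually trained on per-character sequence labels. The following is a minimal sketch, assuming a BIO tagging scheme and two hypothetical entity types, `NAME` and `MONEY`; the paper's actual annotation specification is not given on this page, and the sample sentence is invented for illustration. It shows how annotated spans map to per-character tags and back.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type); the entity types used below are hypothetical.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Turn character-offset entity spans into one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters of the entity
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Recover entity spans from a BIO tag sequence."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel flushes an entity that ends at the last character.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Invented example sentence: "Zhang San was defrauded of 5000 yuan".
text = "张三被骗5000元"
tags = spans_to_bio(text, [(0, 2, "NAME"), (4, 9, "MONEY")])
for char, tag in zip(text, tags):
    print(char, tag)
```

Because every character carries its own tag, entity boundaries are predicted directly and no separate word segmentation step is required, which is one common reading of the word-boundary point made in the abstract.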

CLC Number: TP391.1

Yue WANG, Mengxuan WANG, Sheng ZHANG, Wen DU. Alarm text named entity recognition based on BERT[J]. Journal of Computer Applications, 2020, 40(2): 535-540.

Figures/Tables 9

Fig. 1 BERT-BiLSTM-Attention-CRF model

Fig. 2 BERT pretraining language model

Fig. 3 Word vector composition of BERT pretraining language model

Fig. 4 Structure of feature extractor Transformer

Fig. 5 Schematic diagram of self-attention mechanism

Tab. 1 Experimental environment

Operating system	Ubuntu 16.04
CPU	Intel Core(TM) i7-8700 CPU @ 3.2 GHz × 4
GPU	GTX 1080Ti (32 GB)
Python	3.6.3
TensorFlow	1.10
Memory	32 GB

Fig. 6 Loss curve of validation set and training set

Tab. 2 Results of named entity recognition of each model

Model	Precision	Recall	F1
CRF++ (Baseline)	0.84	0.66	0.74
BiLSTM-Attention-CRF	0.86	0.86	0.86
BiLSTM-CRF	0.83	0.83	0.83
BERT-BiLSTM-Attention-CRF	0.91	0.94	0.92
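
The F1 column of Tab. 2 can be cross-checked from the precision and recall columns, since F1 is by definition their harmonic mean. The sketch below recomputes it from the rounded values in the table; the function name `f1_score` and the dictionary layout are illustrative choices, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Tab. 2.
reported = {
    "CRF++ (Baseline)": (0.84, 0.66, 0.74),
    "BiLSTM-Attention-CRF": (0.86, 0.86, 0.86),
    "BiLSTM-CRF": (0.83, 0.83, 0.83),
    "BERT-BiLSTM-Attention-CRF": (0.91, 0.94, 0.92),
}

for model, (p, r, f1) in reported.items():
    recomputed = round(f1_score(p, r), 2)
    print(f"{model}: recomputed F1 = {recomputed:.2f}, reported F1 = {f1:.2f}")
```

All four recomputed values agree with the reported column at two decimal places. Because the inputs are themselves rounded, a difference of 0.01 would not necessarily indicate an error in a table of this kind.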

Tab. 3 Recognition rate of different named entities

Named entity	Precision	Recall	F1
Incident time	0.80	0.79	0.79
Related location	0.57	0.53	0.55
Victim name	0.89	0.98	0.93
Fraud method	0.60	0.50	0.55
Fraud amount	0.89	0.86	0.87
Transfer channel	0.77	0.63	0.70
Handling method	0.88	0.87	0.87
