Journal of Computer Applications, 2020, Vol. 40, Issue (2): 535-540. DOI: 10.11772/j.issn.1001-9081.2019101717
• CCF Bigdata 2019 •
Yue WANG, Mengxuan WANG, Sheng ZHANG, Wen DU
Received: 2019-08-20
Revised: 2019-10-21
Accepted: 2019-10-24
Online: 2019-10-31
Published: 2020-02-10
Contact: Wen DU
About author: WANG Yue, born in 1994 in Lianyungang, Jiangsu, M.S. candidate, CCF member. His research interests include machine learning and natural language processing.
Yue WANG, Mengxuan WANG, Sheng ZHANG, Wen DU. Alarm text named entity recognition based on BERT[J]. Journal of Computer Applications, 2020, 40(2): 535-540.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2019101717
Item | Configuration |
---|---|
Operating system | Ubuntu 16.04 |
CPU | Intel Core i7-8700 CPU @ 3.2 GHz |
GPU | GTX1080Ti (32 GB) |
Python | 3.6.3 |
TensorFlow | 1.10 |
Memory | 32 GB |
Tab. 1 Experimental environment
Model | Precision | Recall | F1 |
---|---|---|---|
CRF++ (Baseline) | 0.84 | 0.66 | 0.74 |
BiLSTM-Attention-CRF | 0.86 | 0.86 | 0.86 |
BiLSTM-CRF | 0.83 | 0.83 | 0.83 |
BERT-BiLSTM-Attention-CRF | 0.91 | 0.94 | 0.92 |
Tab. 2 Results of named entity recognition of each model
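The F1 values in Tab. 2 are consistent with the standard definition of F1 as the harmonic mean of precision and recall. The following is a minimal sketch that recomputes them from the rounded figures in the table; the function names `f1_score` and `check_table` and the tolerance of 0.01 are illustrative assumptions, not part of the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Tab. 2.
TAB2 = {
    "CRF++ (Baseline)": (0.84, 0.66, 0.74),
    "BiLSTM-Attention-CRF": (0.86, 0.86, 0.86),
    "BiLSTM-CRF": (0.83, 0.83, 0.83),
    "BERT-BiLSTM-Attention-CRF": (0.91, 0.94, 0.92),
}


def check_table(rows, tol=0.01):
    """Map each model to True if the recomputed F1 is within tol of the reported F1.

    tol is an assumed allowance for the table's two-decimal rounding.
    """
    return {
        name: abs(f1_score(p, r) - reported) <= tol
        for name, (p, r, reported) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_table(TAB2).items():
        p, r, reported = TAB2[name]
        print(f"{name}: recomputed {f1_score(p, r):.4f}, reported {reported}, consistent={ok}")
```

Because the inputs are already rounded to two decimals, a recomputed value can differ from the reported one in the last digit; the tolerance is there to absorb that.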
Named entity | Precision | Recall | F1 |
---|---|---|---|
Time of incident | 0.80 | 0.79 | 0.79 |
Related location | 0.57 | 0.53 | 0.55 |
Victim name | 0.89 | 0.98 | 0.93 |
Fraud method | 0.60 | 0.50 | 0.55 |
Fraud amount | 0.89 | 0.86 | 0.87 |
Transfer channel | 0.77 | 0.63 | 0.70 |
Handling method | 0.88 | 0.87 | 0.87 |
Tab. 3 Recognition rate of different named entities