Named entity recognition method of elementary mathematical text based on BERT

doi:10.11772/j.issn.1001-9081.2021020334

Journal of Computer Applications, 2022, Vol. 42, Issue (2): 433-439. DOI: 10.11772/j.issn.1001-9081.2021020334

Special topic: Artificial intelligence

Yi ZHANG, Shuangsheng WANG, Bin HE, Peiming YE, Keqiang LI

School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Received: 2021-03-08 Revised: 2021-04-29 Accepted: 2021-04-30 Online: 2022-02-11 Published: 2022-02-10
Contact: Shuangsheng WANG
About the authors: ZHANG Yi, born in 1970 in Chongqing, M. S., professor. His research interests include educational informatization, deep learning, and machine learning.
WANG Shuangsheng, born in 1995 in Tianmen, Hubei, M. S. candidate. His research interests include natural language processing and deep learning.
HE Bin, born in 1996 in Guilin, Guangxi, M. S. candidate. His research interests include deep learning.
YE Peiming, born in 1995 in Chongqing, M. S. candidate. His research interests include educational informatization and machine learning.
LI Keqiang, born in 1995 in Pingdingshan, Henan, M. S. candidate. His research interests include deep learning.
Supported by:
National Natural Science Foundation of China (6170011898); Chongqing Natural Science Foundation (cstc2018jcyjA0743); Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800640)

Abstract

Traditional Named Entity Recognition (NER) methods have two problems when applied to elementary mathematics: their word embeddings cannot represent polysemy, and some local features are ignored during feature extraction. To address these problems, a Bidirectional Encoder Representation from Transformers (BERT) based NER method for elementary mathematical text was proposed, named BERT-BiLSTM-IDCNN-CRF (BERT, Bidirectional Long Short-Term Memory, Iterated Dilated Convolutional Neural Network, Conditional Random Field). Firstly, BERT was used for pre-training. Then, the word vectors obtained by training were input into a BiLSTM network and an IDCNN to extract features, and the features output by the two networks were merged. Finally, the output was obtained after correction by the CRF. Experimental results show that the F1 score of BERT-BiLSTM-IDCNN-CRF is 93.91% on a dataset of elementary mathematics test questions, which is 4.29 percentage points higher than that of the BiLSTM-CRF baseline and 1.23 percentage points higher than that of BERT-BiLSTM-CRF. The F1 scores of the proposed method on entities such as line, angle, plane, and sequence are all higher than 91%, which verifies its effectiveness for elementary mathematical entity recognition. In addition, after an attention mechanism is added to the proposed method, recall decreases by 0.67 percentage points while precision increases by 0.75 percentage points, so introducing the attention mechanism does little to improve the recognition performance of the proposed method.

Key words: Named Entity Recognition (NER), elementary mathematics, Bidirectional Encoder Representation from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM) network, dilated convolution, attention mechanism
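The last two steps of the pipeline described in the abstract, merging the BiLSTM and IDCNN features and correcting the output with a CRF, can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the feature arrays are random stand-ins for real network outputs, the IDCNN width of 100 and the linear projection are hypothetical, and Viterbi decoding is the standard way to read a label sequence out of a linear-chain CRF rather than a detail confirmed by this page.

```python
import numpy as np


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label path of a linear-chain CRF.

    emissions:   (seq_len, num_labels) per-token label scores
    transitions: (num_labels, num_labels); transitions[i, j] scores label i -> j
    """
    seq_len, num_labels = emissions.shape
    score = emissions[0].copy()  # best score of any path ending in each label
    backptr = np.zeros((seq_len, num_labels), dtype=int)
    for t in range(1, seq_len):
        # candidate[i, j]: best path ending in label i at t-1, then label j at t
        candidate = score[:, None] + transitions + emissions[t][None, :]
        backptr[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0)
    # Follow the back-pointers from the best final label
    path = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return path[::-1]


# Feature merge: concatenate the two feature streams token by token.
rng = np.random.default_rng(0)
seq_len, num_labels = 6, 3
bilstm_feats = rng.normal(size=(seq_len, 256))  # stand-in, 2 x Lstm_dim (128)
idcnn_feats = rng.normal(size=(seq_len, 100))   # stand-in, hypothetical width
merged = np.concatenate([bilstm_feats, idcnn_feats], axis=-1)

# Hypothetical linear layer mapping merged features to label scores
projection = rng.normal(size=(merged.shape[1], num_labels)) * 0.1
labels = viterbi_decode(merged @ projection,
                        rng.normal(size=(num_labels, num_labels)))

# Toy case showing the "correction": switching labels is penalized, so the
# isolated label 1 that a per-token argmax would pick at position 1 is dropped.
toy_emissions = np.array([[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
toy_transitions = np.array([[0.0, -5.0], [-5.0, 0.0]])
print(viterbi_decode(toy_emissions, toy_transitions))  # → [0, 0, 0]
```

In the toy case the per-token argmax would give `[0, 1, 0]`; the transition scores make the consistent path `[0, 0, 0]` win, which is the kind of sequence-level correction a CRF layer contributes.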

CLC number: TP391.1

Cite this article: Yi ZHANG, Shuangsheng WANG, Bin HE, Peiming YE, Keqiang LI. Named entity recognition method of elementary mathematical text based on BERT[J]. Journal of Computer Applications, 2022, 42(2): 433-439.

Figures and tables (10)

Fig. 1 Overall structure of BERT-BiLSTM-IDCNN-CRF neural network model with attention mechanism

Fig. 2 Structure of BERT

Fig. 3 Encoding structure of Transformer

Fig. 4 Schematic diagram of dilated convolution
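Dilated convolution (Fig. 4) is the building block of the IDCNN branch: the filter taps are spaced `dilation` positions apart, so stacked layers cover a wider context without extra parameters. The NumPy sketch below is illustrative only; the kernel width of 3 and the dilation rates 1, 2, 4 are assumptions chosen for the example, not values taken from the paper.

```python
import numpy as np


def dilated_conv1d(x, kernel, dilation):
    """1-D dilated convolution, zero-padded so the output keeps the input length.

    x: (seq_len,) input sequence; kernel: (k,) filter with odd k.
    """
    k = len(kernel)
    pad = dilation * (k - 1) // 2
    padded = np.pad(x, pad)
    out = np.zeros(len(x), dtype=float)
    for i in range(len(x)):
        for j in range(k):
            # Tap j reads the input at offset (j - (k - 1) / 2) * dilation from i
            out[i] += kernel[j] * padded[i + j * dilation]
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field width of a stack of dilated convolution layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# An impulse at position 4 reaches outputs 2, 4 and 6 when dilation = 2.
impulse = np.zeros(9)
impulse[4] = 1.0
response = dilated_conv1d(impulse, np.ones(3), dilation=2)

print(receptive_field(3, [1, 1, 1]))  # → 7, three ordinary layers
print(receptive_field(3, [1, 2, 4]))  # → 15, same depth with growing dilation
```

With the same three layers and the same number of weights, the dilated stack sees 15 positions instead of 7.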

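Tab. 1 Representation and examples of some entities

Entity category	Label	Entity description	Example
Angle	Angle	angle, dihedral angle	∠ABC
Triangle	Triangle	right triangle, acute triangle	triangle BCD
Sequence	sequence	sequence, arithmetic sequence, geometric sequence	sequence $\{a_n\}$
Point	Point	circle point (圆点), moving point	point M(2, 3)
Vector	Vector	vector, unit vector	vector $m n$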
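To show how the labels in Tab. 1 might be attached to text, the sketch below converts entity spans into character-level tags. The BIO scheme (`B-` for the first character of an entity, `I-` for the rest, `O` elsewhere) is an assumption for illustration; this page does not state which tagging scheme the authors used, and `spans_to_bio` is a hypothetical helper.

```python
def spans_to_bio(text, spans):
    """Turn (start, end, label) spans into one BIO tag per character.

    `end` is exclusive. The BIO scheme itself is an assumption, see the lead-in.
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


text = "∠ABC=90°"
tags = spans_to_bio(text, [(0, 4, "Angle")])
for char, tag in zip(text, tags):
    print(char, tag)
```

Here the four characters of `∠ABC` receive `B-Angle`, `I-Angle`, `I-Angle`, `I-Angle`, and the remaining characters receive `O`.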

Tab. 2 Experimental environment

Environment	Configuration
Operating system	Windows 10
CPU	Intel Core i7-8700K @ 3.7 GHz
GPU	NVIDIA GeForce RTX 2080Ti (11 GB)
Python	3.7.1
TensorFlow	1.13.1
Memory	64 GB

Tab. 3 Model parameters

Parameter	Value	Parameter	Value
Number of Transformer layers	12	Batch_size	16
Hidden layer dimension	768	attention_size	128
Optimizer	Adam	dropout	0.5
Learning rate	0.00005	clip	5.0
Lstm_dim	128
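For reuse in code, the values in Tab. 3 could be collected in a single configuration object. The sketch below is one possible layout; the class name `ModelConfig` and the field names are hypothetical, and reading `clip` as a gradient-clipping threshold is an assumption, since the table does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hyperparameters copied from Tab. 3 (field names are illustrative)."""
    transformer_layers: int = 12
    hidden_size: int = 768
    optimizer: str = "Adam"
    learning_rate: float = 5e-5
    lstm_dim: int = 128
    batch_size: int = 16
    attention_size: int = 128
    dropout: float = 0.5
    clip: float = 5.0  # assumed to be a gradient-clipping threshold


config = ModelConfig()
# A bidirectional LSTM concatenates both directions, giving 2 x lstm_dim features.
bilstm_output_dim = 2 * config.lstm_dim
```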

Tab. 4 Comparison of named entity recognition results of different models (%)

Model	Precision	Recall	F1 score
CRF	67.06	47.61	55.68
IDCNN-CRF	86.38	87.62	87.00
BiLSTM-CRF	89.96	89.28	89.62
BiLSTM-Attention-CRF	89.08	91.04	90.05
BiLSTM-IDCNN-CRF	91.37	90.19	90.78
BERT-BiLSTM-CRF	91.07	94.34	92.68
BERT-BiLSTM-IDCNN-CRF	92.89	94.95	93.91
BERT-BiLSTM-IDCNN-Attention-CRF	93.64	94.28	93.96
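The F1 values in Tab. 4 can be cross-checked against the first two columns, since F1 is the harmonic mean of the first metric and recall. The short script below recomputes each row, assuming the first column is precision. Because the inputs are already rounded to two decimals, a recomputed value may differ from the reported one by up to about 0.01.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from Tab. 4
TABLE_4 = {
    "CRF": (67.06, 47.61, 55.68),
    "IDCNN-CRF": (86.38, 87.62, 87.00),
    "BiLSTM-CRF": (89.96, 89.28, 89.62),
    "BiLSTM-Attention-CRF": (89.08, 91.04, 90.05),
    "BiLSTM-IDCNN-CRF": (91.37, 90.19, 90.78),
    "BERT-BiLSTM-CRF": (91.07, 94.34, 92.68),
    "BERT-BiLSTM-IDCNN-CRF": (92.89, 94.95, 93.91),
    "BERT-BiLSTM-IDCNN-Attention-CRF": (93.64, 94.28, 93.96),
}

for name, (p, r, reported) in TABLE_4.items():
    print(f"{name}: computed {f1_score(p, r):.2f}, reported {reported:.2f}")

# Gains quoted in the abstract, in percentage points
gain_over_bilstm_crf = round(93.91 - 89.62, 2)
gain_over_bert_bilstm_crf = round(93.91 - 92.68, 2)
```

The two gains come out as 4.29 and 1.23 percentage points, matching the figures quoted in the abstract.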

Fig. 5 F1 scores of three models on each entity

Tab. 5 Recognition results of BERT-BiLSTM-IDCNN-CRF on each entity (%)

Entity	P	R	F1
Angle	97.75	96.02	96.88
Circle	66.67	66.67	66.67
Cone	100.00	100.00	100.00
Equation	56.25	75.00	64.29
Function	94.12	96.55	95.32
Line	89.41	94.41	91.84
Point	81.65	95.70	88.12
Quadrilateral	100.00	100.00	100.00
Sequence	96.33	95.45	95.89
Set	97.62	100.00	98.80
Plane	100.00	88.89	94.12
Triangle	100.00	100.00	100.00
Vector	94.74	91.76	93.23
