Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (2): 433-439. DOI: 10.11772/j.issn.1001-9081.2021020334

• Artificial Intelligence •


Named entity recognition method of elementary mathematical text based on BERT

Yi ZHANG, Shuangsheng WANG, Bin HE, Peiming YE, Keqiang LI

  1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2021-03-08 Revised: 2021-04-29 Accepted: 2021-04-30 Online: 2021-05-10 Published: 2022-02-10
  • Contact: Shuangsheng WANG
  • About author:ZHANG Yi, born in 1970, M. S., professor. His research interests include educational informatization, deep learning, machine learning.
    WANG Shuangsheng, born in 1995, M. S. candidate. His research interests include natural language processing, deep learning.
    HE Bin, born in 1996, M. S. candidate. His research interests include deep learning.
    YE Peiming, born in 1995, M. S. candidate. His research interests include educational informatization, machine learning.
    LI Keqiang, born in 1995, M. S. candidate. His research interests include deep learning.
  • Supported by:
    National Natural Science Foundation of China (6170011898); Chongqing Natural Science Foundation (cstc2018jcyjA0743); Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800640)


Abstract:

In Named Entity Recognition (NER) for elementary mathematics, traditional NER methods suffer from two problems: their word embeddings cannot represent polysemy, and some local features are ignored during feature extraction. To address these problems, a Bidirectional Encoder Representations from Transformers (BERT) based NER method for elementary mathematical text, BERT-BiLSTM-IDCNN-CRF (BERT-Bidirectional Long Short-Term Memory-Iterated Dilated Convolutional Neural Network-Conditional Random Field), was proposed. First, BERT was used for pre-training; the resulting word vectors were then fed into a BiLSTM network and an IDCNN to extract features; next, the output features of the two networks were merged; finally, the merged features were corrected by a CRF layer to produce the output. Experimental results show that BERT-BiLSTM-IDCNN-CRF achieves an F1 score of 93.91% on a dataset of elementary mathematics test questions, which is 4.29 percentage points higher than that of the BiLSTM-CRF baseline and 1.23 percentage points higher than that of BERT-BiLSTM-CRF. The F1 scores of the proposed method for line, angle, plane, sequence and other entities all exceed 91%, verifying its effectiveness for elementary mathematical entity recognition. In addition, after an attention mechanism was added to the proposed model, recall decreased by 0.67 percentage points while precision increased by 0.75 percentage points, indicating that introducing an attention mechanism brings little improvement to the recognition performance of the proposed method.
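The feature-extraction stage described above (BERT embeddings fed in parallel to a BiLSTM and an IDCNN, with the two feature streams merged before CRF decoding) can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: the hidden sizes, the dilation schedule (1, 1, 2), and the final linear projection that produces emission scores for a downstream CRF are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMIDCNNEncoder(nn.Module):
    """Sketch of the parallel BiLSTM + IDCNN feature extractor.

    BERT token embeddings are assumed to be precomputed (dim 768).
    All layer sizes and the dilation schedule are assumptions for
    illustration; a CRF layer would decode the emitted scores in
    the full BERT-BiLSTM-IDCNN-CRF model.
    """
    def __init__(self, emb_dim=768, hidden=128, filters=128, num_labels=9):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        # Iterated dilated convolutions: stacking dilations widens the
        # receptive field while keeping local n-gram features.
        self.convs = nn.ModuleList([
            nn.Conv1d(emb_dim if i == 0 else filters, filters,
                      kernel_size=3, dilation=d, padding=d)
            for i, d in enumerate([1, 1, 2])
        ])
        # Merge both feature streams, then project to per-label scores.
        self.proj = nn.Linear(2 * hidden + filters, num_labels)

    def forward(self, emb):                  # emb: (B, T, emb_dim)
        lstm_out, _ = self.bilstm(emb)       # (B, T, 2*hidden)
        x = emb.transpose(1, 2)              # (B, emb_dim, T)
        for conv in self.convs:
            x = torch.relu(conv(x))          # length preserved by padding
        cnn_out = x.transpose(1, 2)          # (B, T, filters)
        merged = torch.cat([lstm_out, cnn_out], dim=-1)
        return self.proj(merged)             # emission scores (B, T, labels)

model = BiLSTMIDCNNEncoder()
scores = model(torch.zeros(2, 5, 768))       # 2 sentences, 5 tokens each
```

With `padding=d` and kernel size 3, each dilated convolution preserves sequence length, so the CNN and LSTM outputs align token by token and can be concatenated directly before CRF decoding.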

Key words: Named Entity Recognition (NER), elementary mathematics, Bidirectional Encoder Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM) network, dilated convolution, attention mechanism
