Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (4): 1080-1085. DOI: 10.11772/j.issn.1001-9081.2023040490

• Artificial Intelligence •

Twice attention mechanism distantly supervised relation extraction based on BERT

Quan YUAN1,2, Changping CHEN1,2, Ze CHEN1,2, Linfeng ZHAN1,2

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    2. Research Center of New Communication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2023-05-04 Revised: 2023-07-03 Accepted: 2023-07-10 Online: 2023-12-04 Published: 2024-04-10
  • Contact: Changping CHEN
  • About author: YUAN Quan, born in 1976 in Shaoyang, Hunan, M. S., professor-level senior engineer. His research interests include big data and natural language processing.
    CHEN Changping, born in 1997 in Chongqing, M. S. candidate. His research interests include natural language processing. E-mail: 2501357195@qq.com
    CHEN Ze, born in 1999 in Wuhan, Hubei, M. S. candidate. His research interests include image processing.
    ZHAN Linfeng, born in 1996 in Lu'an, Anhui, M. S. candidate. His research interests include recommendation algorithms.

Abstract:

To address the incomplete semantic information of word vectors and the polysemy problem faced in text feature extraction, a Twice Attention mechanism weighting algorithm for Relation Extraction (TARE) based on BERT (Bidirectional Encoder Representation from Transformer) was proposed. Firstly, in the word embedding stage, a self-attention dynamic encoding algorithm was applied by constructing the Q, K and V matrices, so that the word vector of the current word captured the semantic information of its surrounding text. Then, after the sentence-level feature vectors were output by the model, a locator was used to extract the corresponding parameters of the fully connected layer and construct the relation attention matrix. Finally, a sentence-level attention mechanism was used to assign a different attention score to each sentence-level feature vector, improving the noise resistance of sentence-level features. The experimental results show that, on the NYT-10m dataset, compared with the contrastive-learning-based CIL (Contrastive Instance Learning) algorithm, TARE improves the F1 score by 4.0 percentage points and the average of Precision@100, Precision@200 and Precision@300, computed over predictions ranked in descending order of confidence (P@M), by 11.3 percentage points; on the NYT-10d dataset, compared with the attention-based PCNN-ATT (Piecewise Convolutional Neural Network algorithm based on ATTention mechanism), TARE improves the Area Under the precision-recall Curve (AUC) by 4.8 percentage points and the P@M value by 2.1 percentage points. On mainstream Distantly Supervised Relation Extraction (DSRE) tasks, TARE effectively improves the model's ability to learn data features.
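Below is a minimal sketch, not the authors' implementation, of the two attention steps the abstract describes: (1) scaled dot-product self-attention over token embeddings built from Q, K and V projections, and (2) sentence-level attention that weights each sentence vector in a bag by its affinity to a relation query vector drawn from the fully connected layer's parameters (our reading of the "locator" step). All names, dimensions and the single-head setup are illustrative assumptions.

import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model) token embeddings; w_q/w_k/w_v: (d_model, d_k).
    # Build the Q, K and V matrices, then encode every token with context
    # from the whole sentence via scaled dot-product attention.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

def sentence_level_attention(bag, relation_query):
    # bag: (n_sentences, d) sentence-level feature vectors;
    # relation_query: (d,) row of the fully connected layer's weight matrix
    # selected for the candidate relation. Noisy sentences receive low
    # attention scores and contribute little to the bag representation.
    alpha = F.softmax(bag @ relation_query, dim=0)
    return alpha @ bag

# Toy usage with random tensors, only to show the shapes involved.
d_model, d_k = 768, 64
x = torch.randn(12, d_model)                 # one tokenized sentence
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
encoded = self_attention(x, w_q, w_k, w_v)   # (12, d_k)
bag = torch.randn(5, d_model)                # five sentences in one bag
fc_weight = torch.randn(53, d_model)         # hypothetical relation classifier weights
bag_repr = sentence_level_attention(bag, fc_weight[7])   # (768,)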

Key words: distant supervision, relation extraction, attention mechanism, word embedding feature, fully connected layer

CLC number: