Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (4): 1080-1085. DOI: 10.11772/j.issn.1001-9081.2023040490

• Artificial Intelligence •

Twice attention mechanism distantly supervised relation extraction based on BERT

Quan YUAN1,2, Changping CHEN1,2, Ze CHEN1,2, Linfeng ZHAN1,2

  1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    2. Research Center of New Communication Technology Applications, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2023-05-04 Revised: 2023-07-03 Accepted: 2023-07-10 Online: 2023-12-04 Published: 2024-04-10
  • Contact: Changping CHEN
  • About author: YUAN Quan, born in 1976 in Shaoyang, Hunan, M. S., professor-level senior engineer. His research interests include big data and natural language processing.
    CHEN Changping, born in 1997 in Chongqing, M. S. candidate. His research interests include natural language processing. E-mail: 2501357195@qq.com
    CHEN Ze, born in 1999 in Wuhan, Hubei, M. S. candidate. His research interests include image processing.
    ZHAN Linfeng, born in 1996 in Lu'an, Anhui, M. S. candidate. His research interests include recommendation algorithms.

Abstract:

To address the incomplete semantic information of word vectors and the polysemy problem faced in text feature extraction, a Twice Attention mechanism weighting algorithm for Relation Extraction (TARE) based on BERT (Bidirectional Encoder Representation from Transformer) was proposed. Firstly, in the word embedding stage, a self-attention dynamic encoding algorithm was applied by constructing the Q, K and V matrices, so that the word vector of the current word captured the semantic information of its surrounding text. Then, after the sentence-level feature vectors were output by the model, a locator was used to extract the corresponding parameters of the fully connected layer and construct the relation attention matrix. Finally, a sentence-level attention mechanism was used to assign a different attention score to each sentence-level feature vector, improving the noise resistance of sentence-level features. The experimental results show that, on the NYT-10m dataset, compared with the contrastive-learning-based CIL (Contrastive Instance Learning) algorithm, TARE improves the F1 score by 4.0 percentage points and the average of Precision@100, Precision@200 and Precision@300, computed over predictions ranked in descending order of confidence (P@M), by 11.3 percentage points; on the NYT-10d dataset, compared with the attention-based PCNN-ATT (Piecewise Convolutional Neural Network algorithm based on ATTention mechanism), TARE improves the Area Under the precision-recall Curve (AUC) by 4.8 percentage points and the P@M value by 2.1 percentage points. On mainstream Distantly Supervised Relation Extraction (DSRE) tasks, TARE effectively improves the model's ability to learn data features.
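Below is a minimal sketch, not the authors' implementation, of the two attention steps the abstract describes: (1) scaled dot-product self-attention over token embeddings built from Q, K and V projections, and (2) sentence-level attention that weights each sentence vector in a bag by its affinity to a relation query vector drawn from the fully connected layer's parameters (our reading of the "locator" step). All names, dimensions and the single-head setup are illustrative assumptions.

import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x: (seq_len, d_model) token embeddings; w_q/w_k/w_v: (d_model, d_k).
    # Build the Q, K and V matrices, then encode every token with context
    # from the whole sentence via scaled dot-product attention.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

def sentence_level_attention(bag, relation_query):
    # bag: (n_sentences, d) sentence-level feature vectors;
    # relation_query: (d,) row of the fully connected layer's weight matrix
    # selected for the candidate relation. Noisy sentences receive low
    # attention scores and contribute little to the bag representation.
    alpha = F.softmax(bag @ relation_query, dim=0)
    return alpha @ bag

# Toy usage with random tensors, only to show the shapes involved.
d_model, d_k = 768, 64
x = torch.randn(12, d_model)                 # one tokenized sentence
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
encoded = self_attention(x, w_q, w_k, w_v)   # (12, d_k)
bag = torch.randn(5, d_model)                # five sentences in one bag
fc_weight = torch.randn(53, d_model)         # hypothetical relation classifier weights
bag_repr = sentence_level_attention(bag, fc_weight[7])   # (768,)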

Key words: distant supervision, relation extraction, attention mechanism, word embedding feature, fully connected layer

CLC number: