Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (6): 1801-1808. DOI: 10.11772/j.issn.1001-9081.2024060776
• Artificial Intelligence •
Received: 2024-06-12
Revised: 2024-08-08
Accepted: 2024-08-16
Online: 2024-09-10
Published: 2025-06-10
Contact: Wei SONG
About author: YANG Dawei, born in 1995, M.S. candidate, CCF member. His research interests include natural language processing and relation extraction.
Supported by:
Relation extraction method combining semantic enhancement and perception attention
Dawei YANG 1, Xihai XU 2, Wei SONG 1,3
Abstract:
To address the problems that context-discriminative features of sentences are not considered during text feature extraction and that the correlation information between instances and relation labels is not fully exploited, a relation extraction method combining Semantic enhancement and Perception attention (SPRE) was proposed. Firstly, in the sentence feature encoding stage, a Semantic Enhancement Mechanism (SEM) was constructed to extract salient semantic features of sentences, and salience-enhanced sentence representations were obtained through entity-aware word embedding and Salient Feature Perception (SFP). Secondly, a Perception Attention Mechanism (PAM) was designed to integrate sentence features: the matching degree between a sentence and a relation label was evaluated by perceiving the semantic information between sentences and relation labels, the consistency between the entity types in a sentence and those of the corresponding relation, and the similarity among sentences, thereby fully exploiting the dependency between instances in a bag and relation labels and further improving the denoising ability of the method. Finally, a classifier was used for relation prediction, and the network parameters were adjusted according to the cross entropy between the predicted and actual results. Experimental results on the NYT-10 (New York Times 10) and GDS (Google Distant Supervision) datasets show that on NYT-10, compared with PARE (Passage-Attended Relation Extraction), a relation extraction method based on BERT (Bidirectional Encoder Representations from Transformers), the proposed method improves the Area Under Curve (AUC) by 2.1 percentage points and improves P@M, the mean of Precision@N (P@N) over the top 100, 200 and 300 predictions ranked by confidence in descending order, by 2.4 percentage points; on GDS, the AUC and P@M of the proposed method reach 90.5% and 97.8%, respectively. The proposed method clearly outperforms mainstream distantly supervised relation extraction methods on both datasets, which verifies its effectiveness: it improves the model's ability to learn data features in mainstream distantly supervised relation extraction tasks.
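The bag-level attention that PAM builds on can be illustrated with a minimal numpy sketch. The scoring below is a plain dot product between sentence representations and a relation-label query; the paper's PAM additionally perceives entity-type consistency and inter-sentence similarity, which are omitted here, and all names and vectors are illustrative, not the paper's implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def bag_attention(sentence_reprs, relation_query):
    """Weight each sentence in a bag by its match with a relation-label
    query and return the attention-weighted bag representation.

    sentence_reprs: (n, d) array, one encoded sentence per row.
    relation_query: (d,) array, embedding of the candidate relation label.
    """
    scores = sentence_reprs @ relation_query   # matching degree per sentence
    weights = softmax(scores)                  # noisy sentences get low weight
    return weights @ sentence_reprs, weights

# Toy bag: two sentences aligned with the relation, one noisy sentence.
bag = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0]])
query = np.array([1.0, 0.0])
bag_repr, w = bag_attention(bag, query)
```

With this toy input, the noisy third sentence receives the smallest weight, so it contributes least to the bag representation used for relation prediction.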
CLC number:
Dawei YANG, Xihai XU, Wei SONG. Relation extraction method combining semantic enhancement and perception attention[J]. Journal of Computer Applications, 2025, 45(6): 1801-1808.
Tab. 1  Examples of RE

| Label | Sentence | Salient information | Correct? |
|---|---|---|---|
| founder | S1: Bill Gates is the principal founder of Microsoft. | founder | True |
| | S2: Bill Gates founded Microsoft in 1975. | founded | True |
| | S3: Bill Gates speaking at a Microsoft held …… | — | False |
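The labeling behavior behind Tab. 1 is the distant-supervision assumption: every sentence mentioning a knowledge-base entity pair inherits the pair's relation label. A minimal sketch (the `KB` dict and function name are illustrative, not an API from the paper):

```python
# One knowledge-base triple: (head, tail) -> relation.
KB = {("Bill Gates", "Microsoft"): "founder"}

def distant_label(sentence, head, tail):
    """Label a sentence with the KB relation of its entity pair.

    Any sentence mentioning both entities inherits the label, even a
    sentence like S3 that does not actually express the relation --
    this is the source of label noise that SPRE aims to suppress.
    """
    if head in sentence and tail in sentence:
        return KB.get((head, tail))
    return None

s3 = "Bill Gates speaking at a Microsoft held conference"
noisy = distant_label(s3, "Bill Gates", "Microsoft")  # labeled "founder", yet wrong
```

S3 is thus a false-positive training instance, matching the `False` row in Tab. 1.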
Tab. 2  Statistics of the NYT-10 and GDS datasets

| Dataset | Training set: entity pairs | Training set: instances | Test set: entity pairs | Test set: instances |
|---|---|---|---|---|
| NYT-10 | 293 003 | 570 088 | 96 678 | 172 448 |
| GDS | 6 498 | 11 297 | 3 247 | 5 663 |
Tab. 3  Hyperparameter settings for the two datasets

| Hyperparameter | NYT-10 | GDS |
|---|---|---|
| Word embedding dimension | 50 | 50 |
| Number of convolution kernels k | 230 | 230 |
| Position embedding dimension | 5 | 5 |
| Convolution kernel size | 3 | 3 |
| Hyperparameter λ | 20 | 17 |
| GCN input layer size | 100 | 150 |
| GCN hidden layer size | 750 | 900 |
| GCN output layer size | 1 250 | 150 |
| Classifier input layer size | 690 | 300 |
| Dropout rate | 0.5 | 0.5 |
| Hyperparameter γ | 0.1 | 0.1 |
| Learning rate | 0.5 | 0.5 |
| Batch size | 160 | 160 |
Tab. 4  P@N (N=100, 200, 300), P@M and AUC of SPRE and comparison methods on the NYT-10 and GDS datasets (unit: %)

| Method | NYT-10 P@100 | P@200 | P@300 | P@M | AUC | GDS P@100 | P@200 | P@300 | P@M | AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| PCNN+ATT | 72.9 | 71.5 | 69.6 | 71.3 | 38.4 | 96.4 | 93.3 | 91.5 | 93.7 | 79.9 |
| PCNN+ATT+ENT | 83.0 | 80.0 | 74.7 | 79.2 | 44.8 | 93.9 | 93.7 | 93.5 | 93.7 | 84.9 |
| MUTICAST | 83.7 | 79.2 | 74.2 | 79.0 | 40.2 | — | — | — | — | — |
| FAN | 85.8 | 83.4 | 79.9 | 83.0 | 44.8 | — | — | — | — | — |
| DSRE-VAE | 84.0 | 77.0 | 75.3 | 78.8 | 43.5 | 96.9 | 96.7 | 96.3 | 96.6 | 87.6 |
| CIL | 81.5 | 75.5 | 72.1 | 76.4 | 42.1 | 97.0 | 96.5 | 96.5 | 96.6 | 90.2 |
| HiCLRE | 82.0 | 78.5 | 74.0 | 78.2 | 45.3 | — | — | — | — | — |
| CGRE | 88.9 | 86.4 | 81.8 | 85.7 | 47.4 | 98.0 | 96.7 | 96.5 | 97.0 | 90.3 |
| PARE | 90.0 | 84.0 | 82.3 | 85.4 | 47.5 | 98.5 | 97.5 | 97.0 | 97.7 | 90.4 |
| SPRE | 92.0 | 88.0 | 83.3 | 87.8 | 49.6 | 98.5 | 98.0 | 97.0 | 97.8 | 90.5 |
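The P@N and P@M columns in Tab. 4 can be computed from predictions ranked by confidence in descending order. A minimal sketch (function names are illustrative; the toy ranking below is not data from the paper):

```python
def precision_at_n(ranked_correct, n):
    """P@N: precision over the top-n predictions.

    ranked_correct: booleans, one per prediction, already sorted by
    confidence in descending order; True means a true relation.
    """
    top = ranked_correct[:n]
    return sum(top) / len(top)

def p_at_m(ranked_correct, cutoffs=(100, 200, 300)):
    """P@M: mean of P@N over the given cut-offs."""
    return sum(precision_at_n(ranked_correct, n) for n in cutoffs) / len(cutoffs)

# Toy ranking: 300 predictions, the first 150 correct.
ranked = [True] * 150 + [False] * 150
```

Here `precision_at_n(ranked, 100)` is 1.0, `precision_at_n(ranked, 200)` is 0.75, `precision_at_n(ranked, 300)` is 0.5, so `p_at_m(ranked)` is 0.75.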
Tab. 5  P@N (N=100, 200, 300), P@M and AUC of SPRE and ablation methods on the NYT-10 and GDS datasets (unit: %)

| Method | NYT-10 P@100 | P@200 | P@300 | P@M | AUC | GDS P@100 | P@200 | P@300 | P@M | AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| SPRE w/o PAM | 90.0 | 84.0 | 78.7 | 84.2 | 45.7 | 98.0 | 95.0 | 95.3 | 96.1 | 88.3 |
| SPRE w/o SFP | 89.0 | 85.0 | 80.7 | 84.9 | 48.2 | 98.0 | 97.0 | 93.7 | 96.2 | 89.6 |
| SPRE w/o all | 83.0 | 80.0 | 74.7 | 79.2 | 44.8 | 93.9 | 93.7 | 93.5 | 94.2 | 84.9 |
| SPRE | 92.0 | 88.0 | 83.3 | 87.8 | 49.6 | 98.5 | 98.0 | 97.0 | 97.8 | 90.5 |
Tab. 6  P@N and AUC of SPRE under different γ values on the NYT-10 dataset (unit: %)

| γ | P@100 | P@200 | P@300 | P@M | AUC |
|---|---|---|---|---|---|
| 0.00 | 90.0 | 85.5 | 84.3 | 86.6 | 47.7 |
| 0.05 | 91.0 | 86.5 | 85.3 | 87.6 | 48.1 |
| 0.10 | 92.0 | 88.0 | 83.3 | 87.8 | 49.6 |
| 0.15 | 90.0 | 90.5 | 85.0 | 88.5 | 48.7 |
| 0.20 | 87.0 | 81.0 | 81.3 | 83.1 | 47.8 |
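The AUC reported in the tables is, as is conventional in distantly supervised relation extraction, the area under the precision-recall curve. A minimal trapezoidal-rule sketch (assuming the curve is given as discrete (recall, precision) points; the toy curve is illustrative):

```python
def pr_auc(precisions, recalls):
    """Area under a precision-recall curve by the trapezoidal rule.

    Points must be ordered by increasing recall; precisions[i] is the
    precision at recall recalls[i].
    """
    area = 0.0
    for i in range(1, len(recalls)):
        area += (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
    return area

# Toy curve: constant precision 0.8 from recall 0 to 1 -> area 0.8.
auc = pr_auc([0.8, 0.8, 0.8], [0.0, 0.5, 1.0])
```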