Joint triple extraction model combining pointer network and relational embedding

doi:10.11772/j.issn.1001-9081.2022060846

Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (7): 2116-2124.

Topic: Artificial intelligence

Yuxin TUO, Tao XUE

School of Computer Science, Xi'an Polytechnic University, Xi'an Shaanxi 710600, China

Received: 2022-06-13 Revised: 2022-09-05 Accepted: 2022-09-06 Online: 2022-09-22 Published: 2023-07-10
Contact: Tao XUE
About author: TUO Yuxin, born in 1998 in Xi'an, Shaanxi, M. S. candidate. Her research interests include knowledge graph and relation extraction.
XUE Tao, born in 1973 in Xi'an, Shaanxi, Ph. D., professor. His research interests include natural language processing and big data.
Supported by:
Shaanxi Provincial Technical Innovation Guidance Program (2020CGXNG-012)

Abstract:

To address complex entity overlap and the difficulty of extracting multiple relational triples from natural language text, a joint triple extraction model combining a pointer network and relation embedding is proposed. First, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model encodes the input sentence. Next, head and tail pointer tagging extracts all subjects in the sentence, and an attention mechanism guided by subjects and relations distinguishes the importance of different relation labels to each word, so that relation label information is added to the sentence embedding. Finally, for each subject and each relation, the corresponding objects are extracted with pointer tagging and a cascade structure, and the relational triples are generated. Extensive experiments were conducted on two datasets, New York Times (NYT) and Web Natural Language Generation (WebNLG). The results show that the proposed model outperforms the current best model, the Cascade Binary Tagging Framework (CasRel), by 1.9 and 0.7 percentage points in overall performance on the two datasets, respectively. Compared with the Extract-Then-Label method with Span-based scheme (ETL-Span), the proposed model improves performance by more than 6.0% and more than 3.7%, respectively, on sentences containing 1 to 5 triples. On complex sentences with more than 5 triples in particular, the F1 score of the proposed model improves by 8.5 and 1.3 percentage points, respectively. The model also maintains stable extraction ability while capturing more entity pairs, which further verifies its effectiveness on the triple overlap problem.

Key words: information extraction, overlapping relationship, triple extraction, BERT (Bidirectional Encoder Representations from Transformers), attention mechanism, deep learning
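
The head and tail pointer tagging and the cascade structure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 0.5 threshold, the rule that pairs each start pointer with the nearest end pointer at or after it, the function names `decode_spans` and `extract_triples`, the relation names, and the toy probabilities that stand in for the outputs of the BERT-based taggers.

```python
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)
Probs = Tuple[Sequence[float], Sequence[float]]  # (start probs, end probs)


def decode_spans(start_probs: Sequence[float],
                 end_probs: Sequence[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair each start pointer with the nearest end pointer at or after it."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        later_ends = [e for e in ends if e >= s]
        if later_ends:
            spans.append((s, later_ends[0]))
    return spans


def extract_triples(tokens: Sequence[str],
                    subject_probs: Probs,
                    object_probs: Dict[Span, Dict[str, Probs]]
                    ) -> List[Tuple[str, str, str]]:
    """Cascade: decode subjects first, then objects per (subject, relation)."""
    def text(span: Span) -> str:
        return " ".join(tokens[span[0]:span[1] + 1])

    triples = []
    for subj in decode_spans(*subject_probs):
        # Hypothetical lookup: a real model would run the object tagger
        # conditioned on this subject and on each relation embedding.
        for relation, (o_start, o_end) in object_probs.get(subj, {}).items():
            for obj in decode_spans(o_start, o_end):
                triples.append((text(subj), relation, text(obj)))
    return triples


tokens = ["Jackie", "Chan", "was", "born", "in", "Hong", "Kong"]
subject_probs = ([0.9, 0.1, 0.0, 0.0, 0.0, 0.2, 0.1],
                 [0.1, 0.8, 0.0, 0.0, 0.0, 0.1, 0.3])
object_probs = {
    (0, 1): {
        "place_of_birth": ([0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 0.1],
                           [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.9]),
        "nationality": ([0.0] * 7, [0.0] * 7),  # no object above threshold
    }
}
triples = extract_triples(tokens, subject_probs, object_probs)
print(triples)
```

Because objects are decoded separately for every (subject, relation) pair, one subject can take part in several triples, which is how a cascade of this kind can handle overlapping triples.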

CLC Number: TP391.1

Yuxin TUO, Tao XUE. Joint triple extraction model combining pointer network and relational embedding[J]. Journal of Computer Applications, 2023, 43(7): 2116-2124.


References

1	LIU Q, LI Y, DUAN H, et al. Knowledge graph construction techniques[J]. Journal of Computer Research and Development, 2016, 53(3): 582-600 (in Chinese). DOI: 10.7544/issn1000-1239.2016.20148228
2	YANG B S, CARDIE C. Joint inference for fine-grained opinion extraction[C]// Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2013: 1640-1649.
3	MIWA M, BANSAL M. End-to-end relation extraction using LSTMs on sequences and tree structures[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2016: 1105-1116. DOI: 10.18653/v1/p16-1105
4	KATIYAR A, CARDIE C. Going out on a limb: joint extraction of entity mentions and relations without dependency trees[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 917-928. DOI: 10.18653/v1/p17-1085
5	ZENG X R, ZENG D J, HE S Z, et al. Extracting relational facts by an end-to-end neural model with copy mechanism[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 506-514. DOI: 10.18653/v1/p18-1047
6	TAKANOBU R, ZHANG T Y, LIU J X, et al. A hierarchical framework for relation extraction with reinforcement learning[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 7072-7079. DOI: 10.1609/aaai.v33i01.33017072
7	WEI Z P, SU J L, WANG Y, et al. A novel cascade binary tagging framework for relational triple extraction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 1476-1488. DOI: 10.18653/v1/2020.acl-main.136
8	CHAN Y S, ROTH D. Exploiting syntactico-semantic structures for relation extraction[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2011: 551-560.
9	LI Q, JI H. Incremental joint extraction of entity mentions and relations[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2014: 402-412. DOI: 10.3115/v1/p14-1038
10	YU X F, LAM W. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach[C]// Proceedings of the 23rd International Conference on Computational Linguistics: Posters Volume. [S.l.]: Coling 2010 Organizing Committee, 2010: 1399-1407.
11	MIWA M, SASAKI Y. Modeling joint entity and relation extraction with table representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1858-1869. DOI: 10.3115/v1/d14-1200
12	KATIYAR A, CARDIE C. Investigating LSTMs for joint extraction of opinion entities and relations[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2016: 919-929. DOI: 10.18653/v1/p16-1087
13	ZHENG S C, WANG F, BAO H Y, et al. Joint extraction of entities and relations based on a novel tagging scheme[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 1227-1236. DOI: 10.18653/v1/p17-1113
14	DAI D, XIAO X Y, LYU Y J, et al. Joint extraction of entities and overlapping relations using position-attentive sequence labeling[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 6300-6308. DOI: 10.1609/aaai.v33i01.33016300
15	YU B W, ZHANG Z Y, SU X B, et al. Joint extraction of entities and relations based on a novel decomposition strategy[C]// Proceedings of the 24th European Conference on Artificial Intelligence. Amsterdam: IOS Press, 2020: 2282-2289.
16	FU T J, LI P H, MA W Y. GraphRel: modeling text as relational graphs for joint entity and relation extraction[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 1409-1418. DOI: 10.18653/v1/p19-1136
17	ZENG D J, LIU K, CHEN Y B, et al. Distant supervision for relation extraction via piecewise convolutional neural networks[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2015: 1753-1762. DOI: 10.18653/v1/d15-1203
18	ZENG X R, HE S Z, ZENG D J, et al. Learning the extraction order of multiple relational facts in a sentence with reinforcement learning[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: ACL, 2019: 367-377. DOI: 10.18653/v1/d19-1035
19	ZENG D J, ZHANG H R, LIU Q Y. CopyMTL: copy mechanism for joint extraction of entities and relations with multi-task learning[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 9507-9514. DOI: 10.1609/aaai.v34i05.6495
20	NAYAK T, NG H T. Effective modeling of encoder-decoder architecture for joint entity and relation extraction[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 8528-8535. DOI: 10.1609/aaai.v34i05.6374
21	DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. DOI: 10.18653/v1/n18-2
22	WANG G Y, LI C Y, WANG W L, et al. Joint embedding of words and labels for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 2321-2331. DOI: 10.18653/v1/p18-1216
23	DE VRIES H, STRUB F, MARY J, et al. Modulating early visual processing by language[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6597-6607.
24	RIEDEL S, YAO L M, McCALLUM A. Modeling relations and their mentions without labeled text[C]// Proceedings of the 2010 Joint European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 6323. Berlin: Springer, 2010: 148-163. DOI: 10.5715/jnlp.4.3_1
25	GARDENT C, SHIMORINA A, NARAYAN S, et al. Creating training corpora for NLG micro-planning[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 179-188. DOI: 10.18653/v1/p17-1017
26	WANG Y, YU B, ZHANG Y, et al. TPLinker: single-stage joint extraction of entities and relations through token pair linking[C]// Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: International Committee on Computational Linguistics, 2020: 1572-1582. DOI: 10.18653/v1/2020.coling-main.138

Category	NYT samples, training set	NYT samples, test set	WebNLG samples, training set	WebNLG samples, test set
Normal	37 013	3 266	1 596	246
EPO	9 782	978	227	26
SEO	14 735	1 297	3 406	457

Model	NYT P	NYT R	NYT F₁	WebNLG P	WebNLG R	WebNLG F₁
NovelTagging	62.4	31.7	42.0	52.5	19.3	28.3
CopyRE_One	59.4	53.1	56.0	32.2	28.9	30.5
CopyRE_Mul	61.0	56.6	58.7	37.7	36.4	37.1
GraphRel_1p	62.9	57.3	60.0	42.3	39.2	40.7
GraphRel_2p	63.9	60.0	61.9	44.7	41.1	42.9
ETL-Span	84.3	82.0	83.1	84.0	91.5	87.6
CasRel	89.7	89.5	89.6	93.4	90.1	91.8
Proposed model	92.4	90.6	91.5	93.8	91.2	92.5
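
As a quick check on the table above, the F₁ column can be recomputed from P and R, and the gains over CasRel quoted in the abstract can be derived from the reported F₁ values. The sketch below assumes the standard definition F₁ = 2PR/(P + R) with all values in percent; the helper name `f1` and the dictionary layout are illustrative only.

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r)


# (P, R, reported F1) for the proposed model, copied from the table above.
proposed = {"NYT": (92.4, 90.6, 91.5), "WebNLG": (93.8, 91.2, 92.5)}
# Reported F1 of CasRel, copied from the table above.
casrel_f1 = {"NYT": 89.6, "WebNLG": 91.8}

gains = {}
for dataset, (p, r, reported) in proposed.items():
    recomputed = round(f1(p, r), 1)
    gains[dataset] = round(reported - casrel_f1[dataset], 1)
    print(dataset, "recomputed F1:", recomputed, "gain over CasRel:", gains[dataset])
```

The recomputed values agree with the reported F₁ to one decimal place, and the differences reproduce the 1.9 and 0.7 percentage point gains stated in the abstract.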

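Dataset	Model	Parameters/10⁶	Average training time per epoch/s	Average inference time per sample/ms
NYT	CasRel	107.7	1 947.4	54.0
NYT	TPLinker [26]	109.6	1 771.9	14.2
NYT	Proposed model	102.9	926.4	34.8
WebNLG	CasRel	107.9	901.6	76.8
WebNLG	TPLinker [26]	110.2	845.4	18.0
WebNLG	Proposed model	103.1	81.5	42.3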


Element	Model	NYT P	NYT R	NYT F₁	WebNLG P	WebNLG R	WebNLG F₁
s	CasRel [7]	94.6	92.4	93.5	98.7	92.8	95.7
s	Proposed model	95.1	93.7	94.4	98.1	94.3	96.2
(s, r)	CasRel [7]	93.6	90.9	92.2	94.8	90.3	92.5
(s, r)	Proposed model	93.9	92.8	93.3	93.5	92.7	93.1
(s, r, o)	CasRel [7]	89.7	89.5	89.6	93.4	90.1	91.8
(s, r, o)	Proposed model	92.4	90.6	91.5	93.8	91.2	92.5
