Joint triple extraction model combining pointer network and relational embedding

doi:10.11772/j.issn.1001-9081.2022060846

Abstract:

To address the complexity of entity overlap and the difficulty of extracting multiple relational triples from natural language text, a joint triple extraction model combining a pointer network with relational embedding was proposed. Firstly, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model was used to encode the input sentence. Secondly, head and tail pointer labeling was used to extract all subjects in the sentence, and an attention mechanism guided by the subjects and relations was used to distinguish the importance of different relation labels to each word, so that relation label information was added to the sentence embedding. Finally, for each subject and each relation, the corresponding objects were extracted by using pointer labeling and a cascade structure, and the relational triples were generated. Extensive experiments were conducted on two datasets, New York Times (NYT) and Web Natural Language Generation (WebNLG). The results show that, compared with the current best model, the Novel Cascade Binary Tagging Framework (CasRel), the proposed model improves overall performance by 1.9 and 0.7 percentage points on the two datasets respectively. Compared with the Extract-Then-Label method with Span-based scheme (ETL-Span) model, the proposed model achieves improvements of more than 6.0% and more than 3.7% respectively in the comparison experiments on sentences containing 1 to 5 triples; in particular, on complex sentences containing more than 5 triples, its F1 score is improved by 8.5 and 1.3 percentage points respectively. The model also maintains stable extraction ability while capturing more entity pairs, which further verifies its effectiveness on the triple overlap problem.

Key words: information extraction, overlapping relationship, triple extraction, BERT (Bidirectional Encoder Representations from Transformers), attention mechanism, deep learning
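
The cascade described in the abstract (head and tail pointers for subjects, then relation-specific pointers for objects) can be illustrated with a minimal sketch. It is not the authors' implementation: the 0.5 threshold, the nearest-end pairing rule, the function names, and the toy probabilities are all assumptions made for illustration, and `toy_object_tagger` is a hypothetical stand-in for the model's relation-aware object tagger.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]


def decode_spans(start_probs: List[float],
                 end_probs: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair each predicted start with the nearest predicted end at or after it."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        # Assumed pairing rule: first end index that is not before the start.
        candidates = [e for e in ends if e >= s]
        if candidates:
            spans.append((s, candidates[0]))
    return spans


def extract_triples(tokens: List[str],
                    subj_start: List[float],
                    subj_end: List[float],
                    object_tagger: Callable[[Span, str], Tuple[List[float], List[float]]],
                    relations: List[str]) -> List[Tuple[str, str, str]]:
    """Cascade: decode subjects first, then objects for every (subject, relation) pair."""
    triples = []
    for s, e in decode_spans(subj_start, subj_end):
        for rel in relations:
            obj_start, obj_end = object_tagger((s, e), rel)
            for o_s, o_e in decode_spans(obj_start, obj_end):
                triples.append((" ".join(tokens[s:e + 1]), rel,
                                " ".join(tokens[o_s:o_e + 1])))
    return triples


tokens = ["Barack", "Obama", "was", "born", "in", "Honolulu"]
subj_start = [0.9, 0.1, 0.0, 0.0, 0.0, 0.8]
subj_end = [0.1, 0.95, 0.0, 0.0, 0.0, 0.7]


def toy_object_tagger(subject_span: Span, relation: str):
    # Hypothetical stand-in: a real model would score every token
    # conditioned on the subject and the relation embedding.
    start, end = [0.0] * len(tokens), [0.0] * len(tokens)
    if subject_span == (0, 1) and relation == "born_in":
        start[5], end[5] = 0.9, 0.9
    return start, end


triples = extract_triples(tokens, subj_start, subj_end,
                          toy_object_tagger, ["born_in", "capital_of"])
print(triples)  # [('Barack Obama', 'born_in', 'Honolulu')]
```

Because objects are decoded separately for each subject and each relation, the same entity can take part in several triples, which is what lets a cascade of this kind represent overlapping triples.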

CLC Number:

TP391.1

Yuxin TUO, Tao XUE. Joint triple extraction model combining pointer network and relational embedding[J]. Journal of Computer Applications, 2023, 43(7): 2116-2124.

拓雨欣, 薛涛. 融合指针网络与关系嵌入的三元组联合抽取模型[J]. 计算机应用, 2023, 43(7): 2116-2124.

References

1	刘峤，李杨，段宏，等.知识图谱构建技术综述［J］.计算机研究与发展， 2016， 53（3）： 582-600. 10.7544/issn1000-1239.2016.20148228
	LIU Q， LI Y， DUAN H， et al. Knowledge graph construction techniques［J］. Journal of Computer Research and Development， 2016， 53（3）： 582-600. 10.7544/issn1000-1239.2016.20148228
2	YANG B S， CARDIE C. Joint inference for fine-grained opinion extraction［C］// Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2013： 1640-1649.
3	MIWA M， BANSAL M. End-to-end relation extraction using LSTMs on sequences and tree structures［C］// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2016： 1105-1116. 10.18653/v1/p16-1105
4	KATIYAR A， CARDIE C. Going out on a limb： joint extraction of entity mentions and relations without dependency trees［C］// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2017： 917-928. 10.18653/v1/p17-1085
5	ZENG X R， ZENG D J， HE S Z， et al. Extracting relational facts by an end-to-end neural model with copy mechanism［C］// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2018： 506-514. 10.18653/v1/p18-1047
6	TAKANOBU R， ZHANG T Y， LIU J X， et al. A hierarchical framework for relation extraction with reinforcement learning［C］// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto， CA： AAAI Press， 2019： 7072-7079. 10.1609/aaai.v33i01.33017072
7	WEI Z P， SU J L， WANG Y， et al. A novel cascade binary tagging framework for relational triple extraction［C］// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg， PA： ACL， 2020： 1476-1488. 10.18653/v1/2020.acl-main.136
8	CHAN Y S， ROTH D. Exploiting syntactico-semantic structures for relation extraction［C］// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics： Human Language Technologies. Stroudsburg， PA： ACL， 2011： 551-560.
9	LI Q， JI H. Incremental joint extraction of entity mentions and relations［C］// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2014： 402-412. 10.3115/v1/p14-1038
10	YU X F， LAM W. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach［C］// Proceedings of the 23rd International Conference on Computational Linguistics： Posters Volume. ［S.l.］： Coling 2010 Organizing Committee， 2010： 1399-1407.
11	MIWA M， SASAKI Y. Modeling joint entity and relation extraction with table representation［C］// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg， PA： ACL， 2014： 1858-1869. 10.3115/v1/d14-1200
12	KATIYAR A， CARDIE C. Investigating LSTMs for joint extraction of opinion entities and relations［C］// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2016： 919-929. 10.18653/v1/p16-1087
13	ZHENG S C， WANG F， BAO H Y， et al. Joint extraction of entities and relations based on a novel tagging scheme［C］// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2017： 1227-1236. 10.18653/v1/p17-1113
14	DAI D， XIAO X Y， LYU Y J， et al. Joint extraction of entities and overlapping relations using position-attentive sequence labeling［C］// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto， CA： AAAI Press， 2019： 6300-6308. 10.1609/aaai.v33i01.33016300
15	YU B W， ZHANG Z Y， SU X B， et al. Joint extraction of entities and relations based on a novel decomposition strategy［C］// Proceedings of the 24th European Conference on Artificial Intelligence. Amsterdam： IOS Press， 2020： 2282-2289.
16	FU T J， LI P H， MA W Y. GraphRel： modeling text as relational graphs for joint entity and relation extraction［C］// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg， PA： ACL， 2019： 1409-1418. 10.18653/v1/p19-1136
17	ZENG D J， LIU K， CHEN Y B， et al. Distant supervision for relation extraction via piecewise convolutional neural networks［C］// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg， PA： ACL， 2015： 1753-1762. 10.18653/v1/d15-1203
18	ZENG X R， HE S Z， ZENG D J， et al. Learning the extraction order of multiple relational facts in a sentence with reinforcement learning［C］// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg， PA： ACL， 2019： 367-377. 10.18653/v1/d19-1035
19	ZENG D J， ZHANG H R， LIU Q Y. CopyMTL： copy mechanism for joint extraction of entities and relations with multi-task learning［C］// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto， CA： AAAI Press， 2020： 9507-9514. 10.1609/aaai.v34i05.6495
20	NAYAK T， NG H T. Effective modeling of encoder-decoder architecture for joint entity and relation extraction［C］// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto， CA： AAAI Press， 2020： 8528-8535. 10.1609/aaai.v34i05.6374
21	DEVLIN J， CHANG M W， LEE K， et al. BERT： pre-training of deep bidirectional transformers for language understanding［C］// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics： Human Language Technologies， Volume 1（Long and Short Papers）. Stroudsburg， PA： ACL， 2019： 4171-4186. 10.18653/v1/n18-2
22	WANG G Y， LI C Y， WANG W L， et al. Joint embedding of words and labels for text classification［C］// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2018： 2321-2331. 10.18653/v1/p18-1216
23	DE VRIES H， STRUB F， MARY J， et al. Modulating early visual processing by language［C］// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook， NY： Curran Associates Inc.， 2017： 6597-6607.
24	RIEDEL S， YAO L M， McCALLUM A. Modeling relations and their mentions without labeled text［C］// Proceedings of the 2010 Joint European Conference on Machine Learning and Knowledge Discovery in Databases， LNCS 6323. Berlin： Springer， 2010： 148-163. 10.5715/jnlp.4.3_1
25	GARDENT C， SHIMORINA A， NARAYAN S， et al. Creating training corpora for NLG micro-planning［C］// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics （Volume 1： Long Papers）. Stroudsburg， PA： ACL， 2017： 179-188. 10.18653/v1/p17-1017
26	WANG Y， YU B， ZHANG Y， et al. TPLinker： single-stage joint extraction of entities and relations through token pair linking［C］// Proceedings of the 28th International Conference on Computational Linguistics. Barcelona， Spain： International Committee on Computational Linguistics， 2020： 1572-1582. 10.18653/v1/2020.coling-main.138

Category	NYT training set	NYT test set	WebNLG training set	WebNLG test set
Normal	37 013	3 266	1 596	246
EPO	9 782	978	227	26
SEO	14 735	1 297	3 406	457

Model	NYT P	NYT R	NYT F₁	WebNLG P	WebNLG R	WebNLG F₁
NovelTagging	62.4	31.7	42.0	52.5	19.3	28.3
CopyRE_One	59.4	53.1	56.0	32.2	28.9	30.5
CopyRE_Mul	61.0	56.6	58.7	37.7	36.4	37.1
GraphRel_1p	62.9	57.3	60.0	42.3	39.2	40.7
GraphRel_2p	63.9	60.0	61.9	44.7	41.1	42.9
ETL-Span	84.3	82.0	83.1	84.0	91.5	87.6
CasRel	89.7	89.5	89.6	93.4	90.1	91.8
Proposed model	92.4	90.6	91.5	93.8	91.2	92.5
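
As a quick consistency check on the table above, F₁ can be recomputed from the reported precision and recall as their harmonic mean. The sketch below assumes the table values are percentages and uses the standard F₁ definition; the helper name `f1_score` and the row labels are illustrative, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the CasRel (NYT) row and the last row of the table.
rows = {
    "CasRel (NYT)": (89.7, 89.5, 89.6),
    "Proposed (NYT)": (92.4, 90.6, 91.5),
    "Proposed (WebNLG)": (93.8, 91.2, 92.5),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: recomputed {f1_score(p, r):.1f}, reported {reported}")

# Gain over CasRel on NYT, in percentage points of reported F1.
print(round(91.5 - 89.6, 1))  # 1.9
```

The recomputed values for these rows agree with the reported F₁ to one decimal place, and the NYT difference matches the 1.9 percentage point gain stated in the abstract.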

Dataset	Model	Parameters/10⁶	Average training time per epoch/s	Average inference time per sample/ms
NYT	CasRel	107.7	1 947.4	54.0
	TPLinker [26]	109.6	1 771.9	14.2
	Proposed model	102.9	926.4	34.8
WebNLG	CasRel	107.9	901.6	76.8
	TPLinker [26]	110.2	845.4	18.0
	Proposed model	103.1	81.5	42.3
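
To read the efficiency figures above in relative terms, the NYT rows can be turned into ratios against CasRel. This is a small arithmetic sketch using only numbers from the table; the dictionary layout and the variable names are assumptions made for illustration.

```python
# NYT rows: (training seconds per epoch, inference milliseconds per sample)
nyt = {
    "CasRel": (1947.4, 54.0),
    "TPLinker": (1771.9, 14.2),
    "Proposed model": (926.4, 34.8),
}

base_train, base_infer = nyt["CasRel"]
for name, (train_s, infer_ms) in nyt.items():
    # A ratio above 1 means the model is faster than CasRel on that measure.
    train_speedup = base_train / train_s
    infer_speedup = base_infer / infer_ms
    print(f"{name}: training x{train_speedup:.2f}, inference x{infer_speedup:.2f}")
```

On these figures the proposed model trains roughly twice as fast per epoch as CasRel on NYT, while TPLinker remains the fastest of the three at inference.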