Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (11): 3564-3572.DOI: 10.11772/j.issn.1001-9081.2024111567
• Artificial intelligence •
Shuang LIU, Guijun LUO, Jiana MENG
Received: 2024-11-05
Revised: 2025-03-16
Accepted: 2025-03-20
Online: 2025-04-02
Published: 2025-11-10
Contact: Shuang LIU
About author: LUO Guijun, born in 2000 in Hengyang, Hunan, M.S. candidate. His research interests include information extraction and natural language processing.
Shuang LIU, Guijun LUO, Jiana MENG. Joint extraction model of entities and relations based on memory enhancement and span screening[J]. Journal of Computer Applications, 2025, 45(11): 3564-3572.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024111567
| Dataset | Training samples | Validation samples | Test samples | Entity types | Relation types |
|---|---|---|---|---|---|
| ACE05 | 10 051 | 2 424 | 2 050 | 7 | 6 |
| SciERC | 1 864 | 275 | 551 | 6 | 7 |
| CoNLL04 | 922 | 231 | 288 | 4 | 5 |

Tab. 1 Statistics of datasets
| Item | Configuration |
|---|---|
| Operating system | Ubuntu 20.04 |
| CPU | 15 vCPU Intel Xeon Platinum 8474C |
| GPU | GeForce RTX 3090 |
| Memory | 80 GB |
| Python | 3.8.10 |
| Deep learning framework | PyTorch 1.11.0 + CUDA 11.3 |

Tab. 2 Experimental environment
| Parameter | ACE05 | SciERC | CoNLL04 |
|---|---|---|---|
| Learning rate | 0.000 05 | 0.000 05 | 0.000 05 |
| Training epochs | 200 | 300 | 200 |
| Batch size | 16 | 16 | 16 |
| Dropout rate | 0.4 | 0.4 | 0.4 |
| Gradient clipping value | 5.0 | 5.0 | 5.0 |
| Weight decay rate | 0.000 01 | 0.000 01 | 0.000 01 |
| Early stopping patience | 30 | 30 | 30 |
| Hidden layer size | 150 | 150 | 150 |
| Warmup rate | 0.2 | 0.2 | 0.2 |
| Span coefficient | 0.5 | 0.5 | 0.5 |
| Context window size | 300 | 200 | 400 |
| Span length limit | 8 | 12 | 8 |
| Distance threshold | 1.4 | 1.4 | 1.2 |

Tab. 3 Experimental parameter settings
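As a sketch, the per-dataset hyperparameters listed in Tab. 3 can be organized as a simple Python mapping. The key names below are illustrative only — they do not come from the authors' released code.

```python
# Hyperparameters transcribed from Tab. 3, grouped per dataset.
# Key names are illustrative, not taken from the original implementation.
HYPERPARAMS = {
    "ACE05": {
        "learning_rate": 5e-5, "epochs": 200, "batch_size": 16,
        "dropout": 0.4, "grad_clip": 5.0, "weight_decay": 1e-5,
        "early_stop_patience": 30, "hidden_size": 150, "warmup_rate": 0.2,
        "span_coefficient": 0.5, "context_window": 300,
        "max_span_length": 8, "distance_threshold": 1.4,
    },
    "SciERC": {
        "learning_rate": 5e-5, "epochs": 300, "batch_size": 16,
        "dropout": 0.4, "grad_clip": 5.0, "weight_decay": 1e-5,
        "early_stop_patience": 30, "hidden_size": 150, "warmup_rate": 0.2,
        "span_coefficient": 0.5, "context_window": 200,
        "max_span_length": 12, "distance_threshold": 1.4,
    },
    "CoNLL04": {
        "learning_rate": 5e-5, "epochs": 200, "batch_size": 16,
        "dropout": 0.4, "grad_clip": 5.0, "weight_decay": 1e-5,
        "early_stop_patience": 30, "hidden_size": 150, "warmup_rate": 0.2,
        "span_coefficient": 0.5, "context_window": 400,
        "max_span_length": 8, "distance_threshold": 1.2,
    },
}
```

Only the training epochs, context window size, span length limit, and distance threshold vary across datasets; the remaining settings are shared.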
| Model | Encoder | ACE05 Ent | ACE05 Rel | ACE05 Rel+ | SciERC Ent | SciERC Rel | SciERC Rel+ | CoNLL04 Ent | CoNLL04 Rel | CoNLL04 Rel+ |
|---|---|---|---|---|---|---|---|---|---|---|
| Joint w/ Global | — | 80.8 | 52.1 | 49.5 | — | — | — | — | — | — |
| SPtree | LSTM | 83.4 | — | 55.6 | — | — | — | — | — | — |
| DYGIE | ELMo | 88.4 | 63.2 | — | 65.2 | 41.6 | — | — | — | — |
| Multi-turn QA | BERTL | 84.8 | — | 60.2 | — | — | — | — | — | — |
| OneIE | | 88.8 | 67.5 | — | — | — | — | — | — | — |
| DYGIE++ | BERTB/SciBERT | 88.6 | 63.4 | — | — | — | — | — | — | — |
| TANL | | 89.0 | — | 63.7 | — | — | — | 90.3 | — | 70.0 |
| PURE-F | | 90.1 | 67.7 | 64.8 | 68.9 | 50.1 | 36.8 | — | — | — |
| PURE-A | | — | 66.5 | — | — | 48.1 | — | — | — | — |
| MEERE | | 89.2 | 67.9 | 65.0 | 69.6 | 50.9 | 38.9 | 89.8 | 75.7 | 73.1 |
| Tab-Seq | ALBERT/SciBERT | 89.5 | — | 64.3 | — | — | — | 90.1 | 73.8 | 73.6 |
| PFN | | 89.0 | — | 66.8 | 66.8 | — | 38.4 | — | — | — |
| UniRE | | 90.0 | — | 66.0 | 68.4 | — | 36.9 | — | — | — |
| TablERT | | 87.8 | 65.0 | 61.8 | — | — | — | 90.5 | 73.2 | 72.2 |
| MEERE | | 89.9 | 68.6 | 67.0 | 69.6 | 50.9 | 38.9 | 90.6 | 77.0 | 76.6 |

Tab. 4 Comparison of F1 values on ACE05, SciERC, and CoNLL04 test sets (a blank encoder cell inherits the encoder listed in the row above)
| Setting | ACE05 Ent | ACE05 Rel+ | SciERC Ent | SciERC Rel+ |
|---|---|---|---|---|
| MEERE | 89.9 | 67.0 | 69.6 | 38.9 |
| w/o memory module | 88.5 | 64.9 | 66.7 | 36.0 |
| w/o span screening module | 88.2 | 65.9 | 68.4 | 38.2 |
| w/o cross-sentence context | 89.9 | 65.8 | 69.5 | 38.8 |
| w/o bidirectional relations | 89.8 | 65.7 | 69.3 | 38.4 |

Tab. 5 F1 values with different components removed on ACE05 and SciERC test sets
| Model | ACE05 Rel (F1)/% | ACE05 sentences/s | SciERC Rel (F1)/% | SciERC sentences/s |
|---|---|---|---|---|
| PURE-F | 67.7 | 14.7 | 50.1 | 19.9 |
| PURE-A | 66.5 | 237.6 | 48.8 | 194.7 |
| MEERE | 67.9 | 220.1 | 50.9 | 185.5 |

Tab. 6 F1 value and speed comparison of different models on ACE05 and SciERC test sets