Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2116-2124. DOI: 10.11772/j.issn.1001-9081.2022060846
Special topic: Artificial intelligence
Received: 2022-06-13
Revised: 2022-09-05
Accepted: 2022-09-06
Online: 2022-09-22
Published: 2023-07-10
Contact: Tao XUE
About author: TUO Yuxin, born in 1998 in Xi'an, Shaanxi, M. S. candidate. Her research interests include knowledge graphs and relation extraction.
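Abstract:
To address the complex entity overlap found in natural language text and the difficulty of extracting multiple relational triples, a joint triple extraction model combining a pointer network with relation embedding was proposed. First, the BERT (Bidirectional Encoder Representations from Transformers) pre-trained model was used to encode the input sentence. Then, head and tail pointer tagging was used to extract all subjects in the sentence, and a subject- and relation-guided attention mechanism was adopted to distinguish how important each relation label is to each word, so that relation label information was added to the sentence embedding. Finally, for each subject and each relation, the corresponding objects were extracted with pointer tagging and a cascade structure, and the relational triples were generated. Extensive experiments were conducted on two datasets, New York Times (NYT) and Web Natural Language Generation (WebNLG). The results show that, compared with the current state-of-the-art cascade binary tagging framework (CasRel) model, the overall performance of the proposed model improved by 1.9 and 0.7 percentage points on the two datasets, respectively. Compared with the span-based extract-then-label (ETL-Span) model, the proposed model achieved improvements of more than 6.0% and more than 3.7%, respectively, in the comparison experiments on sentences containing 1 to 5 triples. In particular, on complex sentences containing more than 5 triples, the F1 score of the proposed model improved by 8.5 and 1.3 percentage points, respectively. The model keeps a stable extraction ability while capturing more entity pairs, which further verifies its effectiveness on the triple overlap problem.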
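As a rough illustration of the head and tail pointer tagging and the cascade structure described in the abstract, the following Python sketch decodes spans from per-token start and end probabilities and then, for each subject and each relation, decodes the matching objects. The function names, the 0.5 threshold, the rule of pairing each start with the nearest following end, and the hand-written probabilities are all assumptions made for illustration; they are not taken from the paper, and the BERT encoder and the attention module are omitted.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]                    # inclusive (start, end) token indices
Probs = Tuple[List[float], List[float]]   # (start probabilities, end probabilities)


def decode_spans(start_probs: List[float], end_probs: List[float],
                 threshold: float = 0.5) -> List[Span]:
    """Pair every predicted start with the nearest predicted end at or after it."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


def span_text(tokens: List[str], span: Span) -> str:
    """Join the tokens covered by an inclusive span."""
    return " ".join(tokens[span[0]:span[1] + 1])


def extract_triples(tokens: List[str], subject_probs: Probs,
                    object_probs: Dict[Span, Dict[str, Probs]],
                    threshold: float = 0.5) -> List[Tuple[str, str, str]]:
    """Cascade: subjects first, then objects for each (subject, relation) pair.

    `object_probs` is a hypothetical lookup standing in for the model's
    subject- and relation-conditioned object tagger.
    """
    triples = []
    for subj in decode_spans(subject_probs[0], subject_probs[1], threshold):
        for relation, (o_start, o_end) in object_probs.get(subj, {}).items():
            for obj in decode_spans(o_start, o_end, threshold):
                triples.append((span_text(tokens, subj), relation,
                                span_text(tokens, obj)))
    return triples


# Toy input with made-up probabilities (not model output).
tokens = ["Midwood", ",", "Brooklyn"]
subject_probs = ([0.9, 0.0, 0.8], [0.9, 0.0, 0.8])
object_probs = {
    (0, 0): {"neighborhood of": ([0.1, 0.0, 0.9], [0.1, 0.0, 0.9])},
    (2, 2): {"contains": ([0.9, 0.0, 0.1], [0.9, 0.0, 0.1])},
}
triples = extract_triples(tokens, subject_probs, object_probs)
print(triples)
```

Because objects are decoded separately for every subject and relation, the same entity can appear in several triples, which is how a cascade of this kind can represent overlapping triples.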
Yuxin TUO, Tao XUE. Joint triple extraction model combining pointer network and relational embedding[J]. Journal of Computer Applications, 2023, 43(7): 2116-2124.
Category | NYT training set | NYT test set | WebNLG training set | WebNLG test set |
---|---|---|---|---|
Normal | 37 013 | 3 266 | 1 596 | 246 |
EPO | 9 782 | 978 | 227 | 26 |
SEO | 14 735 | 1 297 | 3 406 | 457 |
Tab. 1 Statistics of two datasets (number of samples)
Model | NYT P | NYT R | NYT F1 | WebNLG P | WebNLG R | WebNLG F1 |
---|---|---|---|---|---|---|
NovelTagging | 62.4 | 31.7 | 42.0 | 52.5 | 19.3 | 28.3 |
CopyREOne | 59.4 | 53.1 | 56.0 | 32.2 | 28.9 | 30.5 |
CopyREMul | 61.0 | 56.6 | 58.7 | 37.7 | 36.4 | 37.1 |
GraphRel1p | 62.9 | 57.3 | 60.0 | 42.3 | 39.2 | 40.7 |
GraphRel2p | 63.9 | 60.0 | 61.9 | 44.7 | 41.1 | 42.9 |
ETL-Span | 84.3 | 82.0 | 83.1 | 84.0 | 91.5 | 87.6 |
CasRel | 89.7 | 89.5 | 89.6 | 93.4 | 90.1 | 91.8 |
Proposed model | 92.4 | 90.6 | 91.5 | 93.8 | 91.2 | 92.5 |
Tab. 2 Performance comparison of different models on experimental datasets (%)
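The F1 columns of Tab. 2 can be related to the P and R columns through the usual harmonic mean, F1 = 2PR/(P + R). The short Python check below recomputes F1 for the NYT rows of CasRel and the proposed model from the tabulated P and R. The helper name `f1_score` is an assumption for illustration, and values recomputed from rounded P and R may differ from published figures in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# P and R on NYT copied from Tab. 2.
casrel_f1 = f1_score(89.7, 89.5)
proposed_f1 = f1_score(92.4, 90.6)

print(f"CasRel F1 on NYT:         {casrel_f1:.1f}")
print(f"Proposed model F1 on NYT: {proposed_f1:.1f}")
print(f"Gain in percentage points: {round(proposed_f1, 1) - round(casrel_f1, 1):.1f}")
```

The recomputed gain on NYT agrees with the 1.9 percentage points stated in the abstract.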
Dataset | Model | Parameters/10^6 | Average training time per epoch/s | Average inference time per sample/ms |
---|---|---|---|---|
NYT | CasRel | 107.7 | 1 947.4 | 54.0 |
NYT | TPLinker[26] | 109.6 | 1 771.9 | 14.2 |
NYT | Proposed model | 102.9 | 926.4 | 34.8 |
WebNLG | CasRel | 107.9 | 901.6 | 76.8 |
WebNLG | TPLinker[26] | 110.2 | 845.4 | 18.0 |
WebNLG | Proposed model | 103.1 | 81.5 | 42.3 |
Tab. 3 Computational efficiency comparison on NYT and WebNLG datasets
Model | NYT P | NYT R | NYT F1 | WebNLG P | WebNLG R | WebNLG F1 |
---|---|---|---|---|---|---|
Proposed model | 92.4 | 90.6 | 91.5 | 93.8 | 91.2 | 92.5 |
Proposed model-SRGA | 90.3 | 89.2 | 89.7 | 91.3 | 90.5 | 90.9 |
Proposed model-CLN | 92.1 | 88.4 | 90.2 | 93.2 | 90.5 | 91.8 |
Tab. 4 Ablation experimental results on two datasets (%)
Example sentence | Triples extracted by CasRel | Triples extracted by proposed model |
---|---|---|
He trained for about six months, he said, running from his house on East 23rd Street in Midwood, Brooklyn, to the Coney Island boardwalk and back, he said | (Brooklyn, contains, Midwood); (Island, neighborhood of, Brooklyn); (Midwood, neighborhood of, Brooklyn) | (Brooklyn, contains, Midwood); (Brooklyn, contains, Island); (Midwood, neighborhood of, Brooklyn); (Island, neighborhood of, Brooklyn) |
Tab. 5 Test results of one sample from NYT dataset
Element | Model | NYT P | NYT R | NYT F1 | WebNLG P | WebNLG R | WebNLG F1 |
---|---|---|---|---|---|---|---|
s | CasRel[7] | 94.6 | 92.4 | 93.5 | 98.7 | 92.8 | 95.7 |
s | Proposed model | 95.1 | 93.7 | 94.4 | 98.1 | 94.3 | 96.2 |
(s, r) | CasRel[7] | 93.6 | 90.9 | 92.2 | 94.8 | 90.3 | 92.5 |
(s, r) | Proposed model | 93.9 | 92.8 | 93.3 | 93.5 | 92.7 | 93.1 |
(s, r, o) | CasRel[7] | 89.7 | 89.5 | 89.6 | 93.4 | 90.1 | 91.8 |
(s, r, o) | Proposed model | 92.4 | 90.6 | 91.5 | 93.8 | 91.2 | 92.5 |
Tab. 6 Results of element extraction of relational triples on two datasets (%)
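One plausible way to obtain element-level scores like those in Tab. 6 is to project every triple onto the elements of interest, that is s, (s, r), or (s, r, o), and then compare the predicted and gold sets by exact match. The Python sketch below shows this for a single sentence. The helper names and the toy gold and predicted triples are hypothetical, and the page does not state the paper's exact evaluation protocol (for example, how counts are aggregated over the test set), so this should be read as an assumption rather than the authors' procedure.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def project(triples: Iterable[Triple], keep: Tuple[int, ...]) -> Set[tuple]:
    """Keep only the chosen positions of each triple, e.g. (0, 1) for (s, r)."""
    return {tuple(t[i] for i in keep) for t in triples}


def prf(pred: Set[tuple], gold: Set[tuple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 between two sets."""
    correct = len(pred & gold)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Hypothetical gold and predicted triples for one sentence.
gold = {("Brooklyn", "contains", "Midwood"),
        ("Midwood", "neighborhood of", "Brooklyn")}
pred = {("Brooklyn", "contains", "Midwood"),
        ("Midwood", "contains", "Brooklyn")}

for name, keep in [("s", (0,)), ("(s, r)", (0, 1)), ("(s, r, o)", (0, 1, 2))]:
    p, r, f = prf(project(pred, keep), project(gold, keep))
    print(f"{name}: P={p:.2f} R={r:.2f} F1={f:.2f}")
```

In this toy case both subjects are found, so the s score is perfect, while the wrong relation on the second triple lowers the (s, r) and (s, r, o) scores. This mirrors the pattern in Tab. 6, where scores drop as more elements must match.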
1 | LIU Q, LI Y, DUAN H, et al. Knowledge graph construction techniques[J]. Journal of Computer Research and Development, 2016, 53(3): 582-600. 10.7544/issn1000-1239.2016.20148228 |
2 | YANG B S, CARDIE C. Joint inference for fine-grained opinion extraction[C]// Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2013: 1640-1649. |
3 | MIWA M, BANSAL M. End-to-end relation extraction using LSTMs on sequences and tree structures[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2016: 1105-1116. 10.18653/v1/p16-1105 |
4 | KATIYAR A, CARDIE C. Going out on a limb: joint extraction of entity mentions and relations without dependency trees[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 917-928. 10.18653/v1/p17-1085 |
5 | ZENG X R, ZENG D J, HE S Z, et al. Extracting relational facts by an end-to-end neural model with copy mechanism[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 506-514. 10.18653/v1/p18-1047 |
6 | TAKANOBU R, ZHANG T Y, LIU J X, et al. A hierarchical framework for relation extraction with reinforcement learning[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 7072-7079. 10.1609/aaai.v33i01.33017072 |
7 | WEI Z P, SU J L, WANG Y, et al. A novel cascade binary tagging framework for relational triple extraction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 1476-1488. 10.18653/v1/2020.acl-main.136 |
8 | CHAN Y S, ROTH D. Exploiting syntactico-semantic structures for relation extraction[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2011: 551-560. |
9 | LI Q, JI H. Incremental joint extraction of entity mentions and relations[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2014: 402-412. 10.3115/v1/p14-1038 |
10 | YU X F, LAM W. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach[C]// Proceedings of the 23rd International Conference on Computational Linguistics: Posters Volume. [S.l.]: Coling 2010 Organizing Committee, 2010: 1399-1407. |
11 | MIWA M, SASAKI Y. Modeling joint entity and relation extraction with table representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1858-1869. 10.3115/v1/d14-1200 |
12 | KATIYAR A, CARDIE C. Investigating LSTMs for joint extraction of opinion entities and relations[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2016: 919-929. 10.18653/v1/p16-1087 |
13 | ZHENG S C, WANG F, BAO H Y, et al. Joint extraction of entities and relations based on a novel tagging scheme[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 1227-1236. 10.18653/v1/p17-1113 |
14 | DAI D, XIAO X Y, LYU Y J, et al. Joint extraction of entities and overlapping relations using position-attentive sequence labeling[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 6300-6308. 10.1609/aaai.v33i01.33016300 |
15 | YU B W, ZHANG Z Y, SU X B, et al. Joint extraction of entities and relations based on a novel decomposition strategy[C]// Proceedings of the 24th European Conference on Artificial Intelligence. Amsterdam: IOS Press, 2020: 2282-2289. |
16 | FU T J, LI P H, MA W Y. GraphRel: modeling text as relational graphs for joint entity and relation extraction[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 1409-1418. 10.18653/v1/p19-1136 |
17 | ZENG D J, LIU K, CHEN Y B, et al. Distant supervision for relation extraction via piecewise convolutional neural networks[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2015: 1753-1762. 10.18653/v1/d15-1203 |
18 | ZENG X R, HE S Z, ZENG D J, et al. Learning the extraction order of multiple relational facts in a sentence with reinforcement learning[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: ACL, 2019: 367-377. 10.18653/v1/d19-1035 |
19 | ZENG D J, ZHANG H R, LIU Q Y. CopyMTL: copy mechanism for joint extraction of entities and relations with multi-task learning[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 9507-9514. 10.1609/aaai.v34i05.6495 |
20 | NAYAK T, NG H T. Effective modeling of encoder-decoder architecture for joint entity and relation extraction[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 8528-8535. 10.1609/aaai.v34i05.6374 |
21 | DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1(Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. 10.18653/v1/n18-2 |
22 | WANG G Y, LI C Y, WANG W L, et al. Joint embedding of words and labels for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 2321-2331. 10.18653/v1/p18-1216 |
23 | DE VRIES H, STRUB F, MARY J, et al. Modulating early visual processing by language[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6597-6607. |
24 | RIEDEL S, YAO L M, McCALLUM A. Modeling relations and their mentions without labeled text[C]// Proceedings of the 2010 Joint European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 6323. Berlin: Springer, 2010: 148-163. 10.5715/jnlp.4.3_1 |
25 | GARDENT C, SHIMORINA A, NARAYAN S, et al. Creating training corpora for NLG micro-planning[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 179-188. 10.18653/v1/p17-1017 |
26 | WANG Y, YU B, ZHANG Y, et al. TPLinker: single-stage joint extraction of entities and relations through token pair linking[C]// Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: International Committee on Computational Linguistics, 2020: 1572-1582. 10.18653/v1/2020.coling-main.138 |