Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (7): 2116-2124. DOI: 10.11772/j.issn.1001-9081.2022060846
• Artificial Intelligence •
Received: 2022-06-13
Revised: 2022-09-05
Accepted: 2022-09-06
Online: 2022-09-22
Published: 2023-07-10
Corresponding author: Tao XUE
About the author: TUO Yuxin, born in 1998 in Xi'an, Shaanxi, M. S. candidate. Her research interests include knowledge graphs and relation extraction.
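Abstract:
To address the complex entity overlap in natural language text and the resulting difficulty of extracting multiple relational triples, a joint triple extraction model combining a pointer network with relational embedding is proposed. First, the pre-trained BERT (Bidirectional Encoder Representations from Transformers) model encodes the input sentence. Next, start and end pointer tagging extracts all subjects in the sentence, and a subject- and relation-guided attention mechanism distinguishes how important each word is under each relation label, so that relation label information is injected into the sentence embedding. Finally, for each subject and each relation, pointer tagging in a cascade structure extracts the corresponding objects and generates the relational triples. Extensive experiments were conducted on the New York Times (NYT) and Web Natural Language Generation (WebNLG) datasets. Compared with the cascade binary tagging framework (CasRel), the best-performing model at the time, the proposed model improves overall performance (F1) by 1.9 and 0.7 percentage points on the two datasets, respectively. Compared with the span-based extract-then-label method (ETL-Span), it achieves gains of more than 6.0% and more than 3.7%, respectively, in the comparison experiments on sentences containing 1 to 5 triples; on complex sentences containing more than 5 triples, its F1 score is 8.5 and 1.3 percentage points higher, respectively. The model keeps a stable extraction ability while capturing more entity pairs, which further verifies its effectiveness on the overlapping-triple problem.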
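The start and end pointer tagging described in the abstract can be illustrated with a minimal sketch. It assumes that every token already has a start probability and an end probability (the values below are made up, not model output), thresholds both at 0.5, and pairs each start with the nearest end at or after it, a heuristic commonly used by cascade taggers. The function name `decode_spans`, the threshold, and the pairing rule are assumptions for illustration; the paper's exact decoding rule is not stated on this page.

```python
from typing import List, Tuple


def decode_spans(start_probs: List[float],
                 end_probs: List[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) token index pairs (hypothetical helper)."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        # Keep only end positions that are not before this start,
        # then take the nearest one.
        candidates = [e for e in ends if e >= s]
        if candidates:
            spans.append((s, candidates[0]))
    return spans


# Toy inputs: probabilities are invented for illustration only.
tokens = ["Midwood", ",", "Brooklyn", "to", "Coney", "Island"]
start_probs = [0.9, 0.1, 0.8, 0.0, 0.7, 0.2]
end_probs = [0.9, 0.0, 0.9, 0.1, 0.1, 0.8]

spans = decode_spans(start_probs, end_probs)
entities = [" ".join(tokens[s:e + 1]) for s, e in spans]
print(entities)
```

In a cascade setup, the same decoding would be applied once to find subjects and then once per (subject, relation) pair to find objects, with a different pair of probability vectors each time.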
Yuxin TUO, Tao XUE. Joint triple extraction model combining pointer network and relational embedding[J]. Journal of Computer Applications, 2023, 43(7): 2116-2124.
Category | NYT training set | NYT test set | WebNLG training set | WebNLG test set
---|---|---|---|---
Normal | 37 013 | 3 266 | 1 596 | 246
EPO | 9 782 | 978 | 227 | 26
SEO | 14 735 | 1 297 | 3 406 | 457

Tab. 1 Statistics of two datasets (number of samples)
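The page does not define the three categories in Tab. 1. Under the definitions commonly used in the overlapping-triple literature, Normal means no triples share an entity, EPO (EntityPairOverlap) means at least two triples share the same entity pair, and SEO (SingleEntityOverlap) means triples share one entity without sharing the whole pair. The sketch below labels a sentence from its triples under those assumed definitions; treating the pair as ordered (subject, object) and the function name `overlap_types` are choices made here, not taken from the paper. A sentence may carry both EPO and SEO.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def overlap_types(triples: List[Triple]) -> Set[str]:
    """Label a sentence as Normal, EPO and/or SEO (assumed definitions)."""
    pairs = [(s, o) for s, _, o in triples]
    labels = set()
    # EPO: at least two triples share the same (subject, object) pair.
    if len(pairs) != len(set(pairs)):
        labels.add("EPO")
    # SEO: after merging duplicate pairs, some entity still appears
    # in more than one distinct pair.
    entities = [e for pair in set(pairs) for e in pair]
    if len(entities) != len(set(entities)):
        labels.add("SEO")
    return labels or {"Normal"}


print(overlap_types([("A", "r1", "B")]))
print(overlap_types([("A", "r1", "B"), ("A", "r2", "B")]))
print(overlap_types([("A", "r1", "B"), ("A", "r2", "C")]))
```

Because one sentence can fall into more than one category, the per-category counts in such a table need not add up to the number of sentences.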
Model | NYT P | NYT R | NYT F1 | WebNLG P | WebNLG R | WebNLG F1
---|---|---|---|---|---|---
NovelTagging | 62.4 | 31.7 | 42.0 | 52.5 | 19.3 | 28.3
CopyRE-One | 59.4 | 53.1 | 56.0 | 32.2 | 28.9 | 30.5
CopyRE-Mul | 61.0 | 56.6 | 58.7 | 37.7 | 36.4 | 37.1
GraphRel-1p | 62.9 | 57.3 | 60.0 | 42.3 | 39.2 | 40.7
GraphRel-2p | 63.9 | 60.0 | 61.9 | 44.7 | 41.1 | 42.9
ETL-Span | 84.3 | 82.0 | 83.1 | 84.0 | 91.5 | 87.6
CasRel | 89.7 | 89.5 | 89.6 | 93.4 | 90.1 | 91.8
Proposed model | 92.4 | 90.6 | 91.5 | 93.8 | 91.2 | 92.5

Tab. 2 Performance comparison of different models on experimental datasets (%)
Dataset | Model | Parameters/10⁶ | Average training time per epoch/s | Average inference time per sample/ms
---|---|---|---|---
NYT | CasRel | 107.7 | 1 947.4 | 54.0
NYT | TPLinker | 109.6 | 1 771.9 | 14.2
NYT | Proposed model | 102.9 | 926.4 | 34.8
WebNLG | CasRel | 107.9 | 901.6 | 76.8
WebNLG | TPLinker | 110.2 | 845.4 | 18.0
WebNLG | Proposed model | 103.1 | 81.5 | 42.3

Tab. 3 Computational efficiency comparison on NYT and WebNLG datasets
Model | NYT P | NYT R | NYT F1 | WebNLG P | WebNLG R | WebNLG F1
---|---|---|---|---|---|---
Proposed model | 92.4 | 90.6 | 91.5 | 93.8 | 91.2 | 92.5
Proposed model-SRGA | 90.3 | 89.2 | 89.7 | 91.3 | 90.5 | 90.9
Proposed model-CLN | 92.1 | 88.4 | 90.2 | 93.2 | 90.5 | 91.8

Tab. 4 Ablation experimental results on two datasets (%)
Example sentence | Triples extracted by CasRel | Triples extracted by the proposed model
---|---|---
He trained for about six months, he said, running from his house on East 23rd Street in Midwood, Brooklyn, to the Coney Island boardwalk and back, he said | (Brooklyn, contains, Midwood); (Island, neighborhood of, Brooklyn); (Midwood, neighborhood of, Brooklyn) | (Brooklyn, contains, Midwood); (Brooklyn, contains, Island); (Midwood, neighborhood of, Brooklyn); (Island, neighborhood of, Brooklyn)

Tab. 5 Test results of one sample from NYT dataset
Element | Model | NYT P | NYT R | NYT F1 | WebNLG P | WebNLG R | WebNLG F1
---|---|---|---|---|---|---|---
s | CasRel | 94.6 | 92.4 | 93.5 | 98.7 | 92.8 | 95.7
s | Proposed model | 95.1 | 93.7 | 94.4 | 98.1 | 94.3 | 96.2
(s, r) | CasRel | 93.6 | 90.9 | 92.2 | 94.8 | 90.3 | 92.5
(s, r) | Proposed model | 93.9 | 92.8 | 93.3 | 93.5 | 92.7 | 93.1
(s, r, o) | CasRel | 89.7 | 89.5 | 89.6 | 93.4 | 90.1 | 91.8
(s, r, o) | Proposed model | 92.4 | 90.6 | 91.5 | 93.8 | 91.2 | 92.5

Tab. 6 Results of element extraction of relational triples on two datasets (%)
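One plausible way to obtain element-level scores such as those in Tab. 6 is to project the gold and predicted triples onto the elements of interest (s, (s, r), or (s, r, o)) and compute precision, recall and F1 over the resulting sets. The sketch below shows this for a single sentence. The gold and predicted triples are invented for illustration, the helper names `project` and `prf` are hypothetical, and the paper's actual evaluation script may count matches differently (for example, accumulated over the whole test set).

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def project(triples: Iterable[Triple], keep: Tuple[int, ...]) -> Set[tuple]:
    """Keep only the selected positions of each triple, as a set."""
    return {tuple(t[i] for i in keep) for t in triples}


def prf(pred: Set[tuple], gold: Set[tuple]) -> Tuple[float, float, float]:
    """Precision, recall and F1 of a predicted set against a gold set."""
    correct = len(pred & gold)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Toy data, invented for illustration (not taken from the datasets).
gold = [("Brooklyn", "contains", "Midwood"),
        ("Midwood", "neighborhood of", "Brooklyn")]
pred = [("Brooklyn", "contains", "Midwood"),
        ("Midwood", "contains", "Brooklyn")]

scores = {}
for name, keep in [("s", (0,)), ("(s, r)", (0, 1)), ("(s, r, o)", (0, 1, 2))]:
    scores[name] = prf(project(pred, keep), project(gold, keep))
    print(name, scores[name])
```

In this toy case both subjects are found, so the s-level scores are perfect, while the wrong relation on the second prediction halves the (s, r) and (s, r, o) scores. This mirrors the pattern in the table, where scores drop as more elements must match.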
References:
[1] LIU Q, LI Y, DUAN H, et al. Knowledge graph construction techniques[J]. Journal of Computer Research and Development, 2016, 53(3): 582-600. DOI: 10.7544/issn1000-1239.2016.20148228
[2] YANG B S, CARDIE C. Joint inference for fine-grained opinion extraction[C]// Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2013: 1640-1649.
[3] MIWA M, BANSAL M. End-to-end relation extraction using LSTMs on sequences and tree structures[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2016: 1105-1116. DOI: 10.18653/v1/p16-1105
[4] KATIYAR A, CARDIE C. Going out on a limb: joint extraction of entity mentions and relations without dependency trees[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 917-928. DOI: 10.18653/v1/p17-1085
[5] ZENG X R, ZENG D J, HE S Z, et al. Extracting relational facts by an end-to-end neural model with copy mechanism[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 506-514. DOI: 10.18653/v1/p18-1047
[6] TAKANOBU R, ZHANG T Y, LIU J X, et al. A hierarchical framework for relation extraction with reinforcement learning[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 7072-7079. DOI: 10.1609/aaai.v33i01.33017072
[7] WEI Z P, SU J L, WANG Y, et al. A novel cascade binary tagging framework for relational triple extraction[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2020: 1476-1488. DOI: 10.18653/v1/2020.acl-main.136
[8] CHAN Y S, ROTH D. Exploiting syntactico-semantic structures for relation extraction[C]// Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2011: 551-560.
[9] LI Q, JI H. Incremental joint extraction of entity mentions and relations[C]// Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2014: 402-412. DOI: 10.3115/v1/p14-1038
[10] YU X F, LAM W. Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach[C]// Proceedings of the 23rd International Conference on Computational Linguistics: Posters Volume. [S.l.]: Coling 2010 Organizing Committee, 2010: 1399-1407.
[11] MIWA M, SASAKI Y. Modeling joint entity and relation extraction with table representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2014: 1858-1869. DOI: 10.3115/v1/d14-1200
[12] KATIYAR A, CARDIE C. Investigating LSTMs for joint extraction of opinion entities and relations[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2016: 919-929. DOI: 10.18653/v1/p16-1087
[13] ZHENG S C, WANG F, BAO H Y, et al. Joint extraction of entities and relations based on a novel tagging scheme[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 1227-1236. DOI: 10.18653/v1/p17-1113
[14] DAI D, XIAO X Y, LYU Y J, et al. Joint extraction of entities and overlapping relations using position-attentive sequence labeling[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 6300-6308. DOI: 10.1609/aaai.v33i01.33016300
[15] YU B W, ZHANG Z Y, SU X B, et al. Joint extraction of entities and relations based on a novel decomposition strategy[C]// Proceedings of the 24th European Conference on Artificial Intelligence. Amsterdam: IOS Press, 2020: 2282-2289.
[16] FU T J, LI P H, MA W Y. GraphRel: modeling text as relational graphs for joint entity and relation extraction[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: ACL, 2019: 1409-1418. DOI: 10.18653/v1/p19-1136
[17] ZENG D J, LIU K, CHEN Y B, et al. Distant supervision for relation extraction via piecewise convolutional neural networks[C]// Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: ACL, 2015: 1753-1762. DOI: 10.18653/v1/d15-1203
[18] ZENG X R, HE S Z, ZENG D J, et al. Learning the extraction order of multiple relational facts in a sentence with reinforcement learning[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg, PA: ACL, 2019: 367-377. DOI: 10.18653/v1/d19-1035
[19] ZENG D J, ZHANG H R, LIU Q Y. CopyMTL: copy mechanism for joint extraction of entities and relations with multi-task learning[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 9507-9514. DOI: 10.1609/aaai.v34i05.6495
[20] NAYAK T, NG H T. Effective modeling of encoder-decoder architecture for joint entity and relation extraction[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 8528-8535. DOI: 10.1609/aaai.v34i05.6374
[21] DEVLIN J, CHANG M W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Stroudsburg, PA: ACL, 2019: 4171-4186. DOI: 10.18653/v1/n18-2
[22] WANG G Y, LI C Y, WANG W L, et al. Joint embedding of words and labels for text classification[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2018: 2321-2331. DOI: 10.18653/v1/p18-1216
[23] DE VRIES H, STRUB F, MARY J, et al. Modulating early visual processing by language[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6597-6607.
[24] RIEDEL S, YAO L M, McCALLUM A. Modeling relations and their mentions without labeled text[C]// Proceedings of the 2010 Joint European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 6323. Berlin: Springer, 2010: 148-163. DOI: 10.5715/jnlp.4.3_1
[25] GARDENT C, SHIMORINA A, NARAYAN S, et al. Creating training corpora for NLG micro-planning[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: ACL, 2017: 179-188. DOI: 10.18653/v1/p17-1017
[26] WANG Y, YU B, ZHANG Y, et al. TPLinker: single-stage joint extraction of entities and relations through token pair linking[C]// Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain: International Committee on Computational Linguistics, 2020: 1572-1582. DOI: 10.18653/v1/2020.coling-main.138