Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (7): 2018-2025. DOI: 10.11772/j.issn.1001-9081.2023071051
• Artificial intelligence •
Dianhui MAO1,2, Xuebo LI1, Junling LIU1, Denghui ZHANG1, Wenjing YAN2
Received: 2023-08-03
Revised: 2023-09-16
Accepted: 2023-09-21
Online: 2023-10-26
Published: 2024-07-10
Contact: Wenjing YAN
About author:
MAO Dianhui, born in 1979, Ph. D., professor. His research interests include blockchain, smart financial technology, food safety, and deep learning.
Dianhui MAO, Xuebo LI, Junling LIU, Denghui ZHANG, Wenjing YAN. Chinese entity and relation extraction model based on parallel heterogeneous graph and sequential attention mechanism[J]. Journal of Computer Applications, 2024, 44(7): 2018-2025.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023071051
| Dataset | Training samples | Test samples | Relation types |
| --- | --- | --- | --- |
| NYT | 61 194 | 5 000 | 24 |
| WebNLG | 5 519 | 703 | 171 |
| CMeIE | 14 339 | 3 585 | 53 |
| DuIE | 18 606 | 2 067 | 48 |
Tab. 1 Statistics of dataset size
| Overlapping triple type | NYT (train) | NYT (test) | WebNLG (train) | WebNLG (test) | CMeIE (train) | CMeIE (test) | DuIE (train) | DuIE (test) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total | 61 194 | 5 000 | 5 519 | 703 | 14 339 | 3 585 | 18 606 | 2 067 |
| Normal | 40 718 | 3 266 | 1 930 | 246 | 5 508 | 1 425 | 11 391 | 1 274 |
| EPO | 10 631 | 978 | 243 | 26 | 189 | 40 | 722 | 83 |
| SEO | 9 845 | 1 297 | 3 346 | 457 | 8 642 | 2 120 | 6 493 | 710 |
Tab. 2 Statistics of triple types in datasets
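The Normal/EPO/SEO breakdown in Tab. 2 follows the usual overlapping-pattern taxonomy for relational triples: Normal sentences contain no triples that share entities, EPO (Entity Pair Overlap) sentences contain triples that share both subject and object, and SEO (Single Entity Overlap) sentences contain triples that share exactly one entity. The Python sketch below shows one plausible way such mutually exclusive counts could be produced; the precedence rule (EPO before SEO) and the `categorize` helper are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: categorize a sentence's gold triples into Normal / EPO / SEO.
from itertools import combinations
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

def categorize(triples: List[Triple]) -> str:
    has_epo, has_seo = False, False
    for (s1, _, o1), (s2, _, o2) in combinations(triples, 2):
        if {s1, o1} == {s2, o2}:      # same entity pair, different relation
            has_epo = True
        elif {s1, o1} & {s2, o2}:     # exactly one shared entity
            has_seo = True
    if has_epo:                        # assumed precedence to keep counts exclusive
        return "EPO"
    if has_seo:
        return "SEO"
    return "Normal"

# Example: two triples sharing only the subject entity -> SEO
print(categorize([("Beijing", "capital_of", "China"),
                  ("Beijing", "located_in", "Asia")]))
```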
| Model | Weight decay | Batch size | Learning rate | Epochs | Dropout | Word embedding dim | Relation embedding dim | Sentence length | Optimizer |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CasRel [12] | — | 6 | 1.00×10⁻⁵ | 150 | — | 768 | 768 | 150/200/300 | Adam |
| RIFRE [13] | 1.00×10⁻⁵ | 6 | 1.00×10⁻¹ | 150 | — | 768 | 768 | 150/200/300 | SGD |
| HNNERJE | 1.00×10⁻⁵ | 6 | 1.00×10⁻¹ | 150 | 0.5 | 768 | 768 | 150/200/300 | SGD |
Tab. 3 Details of experimental parameters
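As a reading aid for Tab. 3, the sketch below mirrors the HNNERJE row as a PyTorch-style training configuration (batch size 6, learning rate 1×10⁻¹, weight decay 1×10⁻⁵, 150 epochs, dropout 0.5, 768-dimensional embeddings, SGD). The `ToyEncoder` class and the field names are placeholders assumed for illustration; they do not reproduce the authors' implementation.

```python
# Hedged sketch: hyperparameters from the HNNERJE row of Tab. 3 as a config dict.
import torch
import torch.nn as nn

config = {
    "batch_size": 6,
    "learning_rate": 1e-1,
    "weight_decay": 1e-5,
    "epochs": 150,
    "dropout": 0.5,
    "word_emb_dim": 768,      # matches BERT-base hidden size
    "rel_emb_dim": 768,
    "max_sentence_len": 300,  # 150/200/300 depending on dataset
}

class ToyEncoder(nn.Module):
    """Placeholder module standing in for the actual HNNERJE network."""
    def __init__(self, cfg):
        super().__init__()
        self.dropout = nn.Dropout(cfg["dropout"])
        self.proj = nn.Linear(cfg["word_emb_dim"], cfg["rel_emb_dim"])

    def forward(self, x):
        return self.proj(self.dropout(x))

model = ToyEncoder(config)
optimizer = torch.optim.SGD(model.parameters(),
                            lr=config["learning_rate"],
                            weight_decay=config["weight_decay"])
```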
| Model | NYT F1 | NYT Precision | NYT Recall | WebNLG F1 | WebNLG Precision | WebNLG Recall | CMeIE F1 | CMeIE Precision | CMeIE Recall | DuIE F1 | DuIE Precision | DuIE Recall |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CasRel [12] | 89.60 | 89.70 | 89.50 | 91.80 | 93.40 | 90.10 | — | — | — | — | — | — |
| RIFRE [13] | 92.00 | 93.60 | 90.50 | 92.60 | 93.30 | 92.00 | — | — | — | — | — | — |
| CasRel* | 89.96 | 88.79 | 89.38 | 91.15 | 92.19 | 91.11 | 44.69 | 44.38 | 45.01 | 66.35 | 66.91 | 65.80 |
| RIFRE* | 91.78 | 91.63 | 91.94 | 92.38 | 92.75 | 92.00 | 45.96 | 51.47 | 41.52 | 66.96 | 68.92 | 65.11 |
| HNNERJE (w/o adversarial training) | 91.60 | 91.68 | 91.51 | 92.45 | 93.00 | 91.91 | 47.12 | 47.59 | 46.55 | 66.41 | 64.43 | 68.21 |
| HNNERJE | 92.17 | 92.96 | 91.48 | 93.42 | 93.13 | 93.71 | 47.40 | 48.49 | 46.37 | 67.98 | 71.35 | 67.91 |
Tab. 4 Comparison of experimental results of different models on WebNLG, NYT, CMeIE and DuIE datasets
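The precision, recall and F1 values in Tab. 4 are the standard micro-averaged scores for relational triple extraction, where a predicted triple counts as correct only if its subject, relation and object all exactly match a gold triple. A minimal sketch of this metric, under that exact-match assumption, is given below.

```python
# Hedged sketch: micro precision / recall / F1 over sentence-level triple sets.
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

def micro_prf(pred_sets: Iterable[Set[Triple]],
              gold_sets: Iterable[Set[Triple]]) -> Tuple[float, float, float]:
    tp = pred_total = gold_total = 0
    for pred, gold in zip(pred_sets, gold_sets):
        tp += len(pred & gold)          # exact-match true positives
        pred_total += len(pred)
        gold_total += len(gold)
    precision = tp / pred_total if pred_total else 0.0
    recall = tp / gold_total if gold_total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: one sentence, one correct and one spurious prediction
p, r, f1 = micro_prf([{("A", "r1", "B"), ("A", "r2", "C")}],
                     [{("A", "r1", "B")}])
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")   # P=0.50 R=1.00 F1=0.67
```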
| Model | Number of triples N | NYT F1/% | WebNLG F1/% | CMeIE F1/% | DuIE F1/% |
| --- | --- | --- | --- | --- | --- |
| CasRel [12] | 1 | 88.20 | 89.30 | — | — |
| CasRel [12] | 2 | 90.30 | 90.80 | — | — |
| CasRel [12] | 3 | 91.90 | 94.20 | — | — |
| CasRel [12] | 4 | 94.20 | 92.40 | — | — |
| CasRel [12] | ≥5 | 83.70 | 90.90 | — | — |
| RIFRE [13] | 1 | 90.70 | 90.20 | — | — |
| RIFRE [13] | 2 | 92.80 | 92.00 | — | — |
| RIFRE [13] | 3 | 93.40 | 94.80 | — | — |
| RIFRE [13] | 4 | 94.80 | 93.00 | — | — |
| RIFRE [13] | ≥5 | 89.60 | 92.00 | — | — |
| CasRel* | 1 | 88.25 | 88.85 | 32.84 | 65.31 |
| CasRel* | 2 | 90.83 | 90.59 | 39.81 | 66.16 |
| CasRel* | 3 | 92.58 | 93.83 | 42.47 | 67.12 |
| CasRel* | 4 | 94.46 | 92.16 | 49.17 | 69.56 |
| CasRel* | ≥5 | 83.48 | 90.01 | 47.12 | 65.31 |
| RIFRE* | 1 | 90.20 | 89.02 | 35.30 | 65.51 |
| RIFRE* | 2 | 92.43 | 91.59 | 41.85 | 66.50 |
| RIFRE* | 3 | 92.96 | 94.41 | 43.26 | 66.87 |
| RIFRE* | 4 | 95.06 | 93.35 | 50.40 | 67.99 |
| RIFRE* | ≥5 | 89.63 | 91.95 | 50.16 | 68.61 |
| HNNERJE | 1 | 90.92 | 90.24 | 35.99 | 66.95 |
| HNNERJE | 2 | 93.04 | 92.17 | 42.68 | 68.05 |
| HNNERJE | 3 | 93.50 | 95.06 | 45.26 | 67.70 |
| HNNERJE | 4 | 95.90 | 94.44 | 51.39 | 69.65 |
| HNNERJE | ≥5 | 90.28 | 92.38 | 52.58 | 69.08 |
Tab. 5 F1 scores of extracting triples from sentences with different numbers of triples
| Model | Iteration layers | NYT F1/% | WebNLG F1/% | CMeIE F1/% | DuIE F1/% |
| --- | --- | --- | --- | --- | --- |
| HNNERJE (without adversarial training) | 1 | 91.49 | 92.21 | 46.35 | 64.14 |
| HNNERJE (without adversarial training) | 2 | 91.51 | 92.36 | 47.12 | 64.77 |
| HNNERJE (without adversarial training) | 3 | 91.46 | 92.45 | 46.38 | 66.41 |
| HNNERJE (without adversarial training) | 4 | 91.60 | 92.39 | 46.67 | 65.98 |
| HNNERJE (with adversarial training) | 1 | 92.06 | 92.25 | 47.06 | 66.42 |
| HNNERJE (with adversarial training) | 2 | 91.68 | 93.42 | 46.75 | 67.98 |
| HNNERJE (with adversarial training) | 3 | 92.17 | 93.06 | 47.02 | 65.98 |
| HNNERJE (with adversarial training) | 4 | 91.99 | 92.46 | 47.40 | 66.94 |
Tab. 6 F1 scores of HNNERJE model with and without adversarial training in extracting relational triples
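Tab. 6 compares HNNERJE with and without adversarial training. One common way to realize adversarial training for text models is an FGM-style perturbation of the word-embedding weights along the loss gradient; the sketch below illustrates that scheme. Whether HNNERJE uses exactly this formulation, and the `fgm_perturb` helper name, are assumptions made here for illustration only.

```python
# Hedged sketch: FGM-style adversarial perturbation of embedding weights.
import torch

def fgm_perturb(embedding: torch.nn.Embedding, epsilon: float = 1.0):
    """Add an epsilon-scaled perturbation along the gradient of the embedding
    weights; returns a backup so the weights can be restored afterwards."""
    backup = embedding.weight.data.clone()
    grad = embedding.weight.grad
    if grad is not None:
        norm = torch.norm(grad)
        if norm and not torch.isnan(norm):
            embedding.weight.data.add_(epsilon * grad / norm)
    return backup

# Typical loop (assuming `model`, `loss_fn`, `batch`, `optimizer` are defined):
#   loss = loss_fn(model(batch)); loss.backward()            # 1) normal backward
#   backup = fgm_perturb(model.embedding)                     # 2) perturb embeddings
#   adv_loss = loss_fn(model(batch)); adv_loss.backward()     # 3) adversarial backward
#   model.embedding.weight.data = backup                      # 4) restore weights
#   optimizer.step(); optimizer.zero_grad()
```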
1 | CUI L, WU Y, LIU J, et al. Template-based named entity recognition using BART [EB/OL]. (2021-06-03) [2023-09-14]. . |
2 | LI T H, HUO Q R, YAN Y, et al. Chinese relation extraction model based on ERNIE and attention mechanism [J]. Journal of Chinese Computer Systems, 2022, 43(6): 1226-1231. (in Chinese) |
3 | TUO M, YANG W. Review of entity relation extraction [J]. Journal of Intelligent & Fuzzy Systems, 2023, 44(5): 7391-7405. |
4 | GARDENT C, SHIMORINA A, NARAYAN S, et al. Creating training corpora for NLG micro-planners [C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2017: 179-188. |
5 | RIEDEL S, YAO L, McCALLUM A. Modeling relations and their mentions without labeled text [C]// Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases. Berlin: Springer, 2010: 148-163. |
6 | ZHENG S, WANG F, BAO H, et al. Joint extraction of entities and relations based on a novel tagging scheme [C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2017: 1227-1236. |
7 | ZENG X, ZENG D, HE S, et al. Extracting relational facts by an end-to-end neural model with copy mechanism [C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2018: 506-514. |
8 | FU T-J, LI P-H, MA W-Y. GraphRel: modeling text as relational graphs for joint entity and relation extraction [C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2019: 1409-1418. |
9 | ZHAI S P, BAI X X, ZHANG Y H, et al. Triple extraction combining dependency analysis and graph attention network [J]. Computer Engineering and Applications, 2023, 59(12): 148-156. (in Chinese) |
10 | ZHU X B, ZHOU G, CHEN J, et al. Single-stage joint entity and relation extraction method based on enhanced sequence annotation strategy [J]. Computer Science, 2023, 50(8): 184-192. (in Chinese) |
11 | BEKOULIS G, DELEU J, DEMEESTER T, et al. Joint entity recognition and relation extraction as a multi-head selection problem [J]. Expert Systems with Applications, 2018, 114: 34-45. |
12 | WEI Z, SU J, WANG Y, et al. A novel cascade binary tagging framework for relational triple extraction [C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 1476-1488. |
13 | ZHAO K, XU H, CHENG Y, et al. Representation iterative fusion based on heterogeneous graph neural network for joint entity and relation extraction [J]. Knowledge-Based Systems, 2021, 219: 106888. |
14 | MIWA M, BANSAL M. End-to-end relation extraction using LSTMs on sequences and tree structures [C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2016: 1105-1116. |
15 | KATIYAR A, CARDIE C. Going out on a limb: joint extraction of entity mentions and relations without dependency trees [C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg: ACL, 2017: 917-928. |
16 | MIWA M, SASAKI Y. Modeling joint entity and relation extraction with table representation [C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2014: 1858-1869. |
17 | GUPTA P, SCHÜTZE H, ANDRASSY B. Table filling multi-task recurrent neural network for joint entity and relation extraction [C]// Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers. Stroudsburg: ACL, 2016: 2537-2547. |
18 | HONG Y, LIU Y, YANG S, et al. Improving graph convolutional networks based on relation-aware attention for end-to-end relation extraction [J]. IEEE Access, 2020, 8: 51315-51323. |
19 | WANG X, JI H, SHI C, et al. Heterogeneous graph attention network [C]// Proceedings of the 2019 World Wide Web Conference. New York: ACM, 2019: 2022-2032. |
20 | CHEN H, HONG P, HAN W, et al. Dialogue relation extraction with document-level heterogeneous graph attention networks [J]. Cognitive Computation, 2023, 15: 793-802. |
21 | KAMBAR M E Z N, ESMAEILZADEH A, TAGHVA K. Chemical-gene relation extraction with graph neural networks and BERT encoder [C]// Proceedings of the 2022 International Conference on Innovations in Computing Research. Cham: Springer, 2022: 166-179. |
22 | QIN Y, CARLINI N, COTTRELL G, et al. Imperceptible, robust, and targeted adversarial examples for automatic speech recognition [C]// Proceedings of the 36th International Conference on Machine Learning. New York: PMLR, 2019: 5231-5240. |
23 | CHEN H, LU G, WU X, et al. Joint extraction of entities and relations by adversarial training and mixup data augmentation [C]// Proceedings of the 2021 7th International Conference on Computer and Communications. Piscataway: IEEE, 2021: 1486-1490. |
24 | DEVLIN J, CHANG M-W, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding [EB/OL]. (2018-10-12) [2023-09-14]. . |
25 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
26 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010. |
27 | LI S J, HE W, SHI Y B, et al. DuIE: a large-scale Chinese dataset for information extraction [C]// Proceedings of the 8th CCF International Conference on Natural Language Processing and Chinese Computing. Cham: Springer, 2019: 791-800. |
28 | GUAN T F, ZAN H Y, ZHOU X B, et al. CMeIE: construction and evaluation of Chinese medical information extraction dataset [C]// Proceedings of the 9th CCF International Conference on Natural Language Processing and Chinese Computing. Cham: Springer, 2020: 270-282. |