Journal of Computer Applications, 2022, Vol. 42, Issue 7: 2001-2008. DOI: 10.11772/j.issn.1001-9081.2021050861
Topic: Artificial intelligence
Yayao ZUO, Haoyu CHEN, Zhiran CHEN, Jiawei HONG, Kun CHEN
Received: 2021-05-25
Revised: 2021-09-09
Accepted: 2021-10-12
Online: 2021-09-09
Published: 2022-07-10
Contact: Yayao ZUO
About author: CHEN Haoyu, born in 1995, a native of Guangzhou, Guangdong, M.S. candidate. His research interests include natural language processing and deep learning.
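Abstract:
To address the nonlinear relationships between characters that are common in language, and to capture richer semantic features, a Named Entity Recognition (NER) method based on Graph Convolutional Network (GCN) and self-attention mechanism was proposed. Firstly, drawing on the ability of deep learning methods to extract character features effectively, GCN was used to learn the global semantic features between characters, and a Bidirectional Long Short-Term Memory (BiLSTM) network was used to extract the context-dependent features of characters. Secondly, these features were fused, and the self-attention mechanism was introduced to compute their internal importance. Finally, Conditional Random Field (CRF) was used to decode the optimal label sequence from the fused features, and this sequence was taken as the entity recognition result. Experimental results show that, compared with methods using only BiLSTM and CRF, the proposed method improves precision by at least 2.39% and 15.2% on the Microsoft Research Asia (MSRA) dataset and the BioNLP/NLPBA 2004 dataset, respectively. The method therefore shows good sequence labeling ability on both Chinese and English datasets, together with strong generalization ability.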
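The final decoding step can be illustrated with a minimal sketch of Viterbi decoding, the standard dynamic-programming procedure for finding the highest-scoring label sequence under a linear-chain CRF. The scores below are invented for illustration; in the method described above, emission scores would come from the fused, attention-weighted features, and transition scores would be learned by the CRF layer. The function name `viterbi_decode` and the toy label set are assumptions, not the authors' code.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emissions[t][y] is the score of label y at position t;
    transitions[(a, b)] is the score of moving from label a to label b
    (missing pairs are treated as 0.0).
    """
    if not emissions:
        return []
    # best[t][y]: best score of any path ending in label y at position t
    best = [{y: emissions[0][y] for y in labels}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = best[-1][prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final label
    last = max(labels, key=lambda y: best[-1][y])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy example (hypothetical scores): three characters, person labels plus "O".
LABELS = ["O", "B-PER", "I-PER"]
EMISSIONS = [
    {"O": 0.1, "B-PER": 2.0, "I-PER": 0.5},
    {"O": 1.0, "B-PER": 0.2, "I-PER": 0.9},
    {"O": 2.0, "B-PER": 0.1, "I-PER": 0.3},
]
# Penalize the invalid move O -> I-PER and reward B-PER -> I-PER.
TRANSITIONS = {("O", "I-PER"): -5.0, ("B-PER", "I-PER"): 1.5}

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, LABELS)
print(decoded)
```

The transition scores are what distinguish CRF decoding from picking the best label at each position independently: here the second character would be labeled `O` on emission score alone, but the rewarded `B-PER -> I-PER` transition changes the result.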
Yayao ZUO, Haoyu CHEN, Zhiran CHEN, Jiawei HONG, Kun CHEN. Named entity recognition method combining multiple semantic features[J]. Journal of Computer Applications, 2022, 42(7): 2001-2008.
Tab.1 MSRA entity labels
Named entity category | First character | Non-first character |
---|---|---|
Person | B-PER | I-PER |
Location | B-LOC | I-LOC |
Organization | B-ORG | I-ORG |
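As a hedged illustration of the MSRA labeling scheme, the sketch below assigns `B-` to the first character of each entity and `I-` to the remaining characters. The `O` label for characters outside any entity is the usual BIO convention and is an assumption here, since the label table lists only entity labels; the sentence, spans, and function name are hypothetical.

```python
from typing import List, Tuple


def spans_to_bio(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end, type) character spans (end exclusive) to BIO labels."""
    labels = ["O"] * len(text)  # "O" = outside any entity (assumed convention)
    for start, end, entity_type in spans:
        labels[start] = f"B-{entity_type}"  # first character of the entity
        for i in range(start + 1, end):
            labels[i] = f"I-{entity_type}"  # non-first characters
    return labels


# Hypothetical sentence: "Li Ming works at Peking University"
sentence = "李明在北京大学工作"
entity_spans = [(0, 2, "PER"), (3, 7, "ORG")]
bio_labels = spans_to_bio(sentence, entity_spans)
for char, label in zip(sentence, bio_labels):
    print(char, label)
```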
Tab.2 BioNLP/NLPBA 2004 entity labels
Named entity category | First character | Non-first character |
---|---|---|
protein | B-protein | I-protein |
DNA | B-DNA | I-DNA |
RNA | B-RNA | I-RNA |
cell line | B-line | I-line |
cell type | B-type | I-type |
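Going the other way, predicted labels have to be grouped back into entities before precision and recall can be computed. The sketch below shows one simple way to do that; the token sequence is hypothetical, and the choice to ignore an `I-` label that does not continue a span of the same type is an assumption of this sketch rather than something the source specifies.

```python
from typing import List, Tuple


def bio_to_spans(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collect (type, start, end) spans (end exclusive) from a BIO label sequence."""
    spans = []
    current_type, start = None, 0
    for i, label in enumerate(labels + ["O"]):  # sentinel "O" flushes the last span
        continues = label.startswith("I-") and label[2:] == current_type
        if current_type is not None and not continues:
            spans.append((current_type, start, i))
            current_type = None
        if label.startswith("B-"):
            current_type, start = label[2:], i
    return spans


# Hypothetical tokens and labels using the biomedical label set
tokens = ["IL-2", "gene", "expression", "requires", "NF-kappa", "B"]
labels = ["B-DNA", "I-DNA", "O", "O", "B-protein", "I-protein"]
entities = bio_to_spans(labels)
for entity_type, start, end in entities:
    print(entity_type, " ".join(tokens[start:end]))
```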
Tab.3 Dataset division
Dataset | Training set | Validation set | Test set |
---|---|---|---|
MSRA | 37 091 | 9 273 | 4 365 |
BioNLP/NLPBA 2004 | 18 241 | 4 561 | 4 256 |
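A quick check of the counts suggests that, in both datasets, about one fifth of the non-test data is held out for validation. This is an observation from the numbers only; the page does not state how the split was made, and the helper name below is hypothetical.

```python
# Sizes copied from the dataset division table: (training, validation, test)
SPLITS = {
    "MSRA": (37091, 9273, 4365),
    "BioNLP/NLPBA 2004": (18241, 4561, 4256),
}


def validation_share(train: int, validation: int) -> float:
    """Fraction of the non-test data held out for validation."""
    return validation / (train + validation)


shares = {
    name: round(validation_share(train, validation), 3)
    for name, (train, validation, _test) in SPLITS.items()
}
print(shares)
```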
Tab.4 Model parameter setting
Parameter | Value | Unit |
---|---|---|
Text length | 60 | characters |
Vector dimension | 100 | dimensions |
Number of self-attention heads | 6 | - |
Number of LSTMs | 2 | - |
Number of GCN layers | 2 | - |
Dropout rate | 0.5 | - |
Training batch size | 40 | epochs |
Sliding window size | 5 | - |
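The parameter table lists a sliding window of size 5 and two GCN layers, but this page does not say how the character graph is built. As one plausible reading only, the sketch below links characters that can fall within the same window and applies the symmetric normalization commonly used for GCN layers, $\hat{A} = D^{-1/2}(A + I)D^{-1/2}$. The function names and the position-based graph are assumptions for illustration, and the learned weight matrix and nonlinearity of a real GCN layer are omitted.

```python
import math
from typing import List

Matrix = List[List[float]]


def window_adjacency(num_chars: int, window: int = 5) -> Matrix:
    """Adjacency with self-loops: positions i and j are linked if |i - j| < window."""
    return [
        [1.0 if abs(i - j) < window else 0.0 for j in range(num_chars)]
        for i in range(num_chars)
    ]


def normalize(adj: Matrix) -> Matrix:
    """Symmetric normalization D^{-1/2} A D^{-1/2} (adj already includes self-loops)."""
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in adj]
    n = len(adj)
    return [
        [inv_sqrt_deg[i] * adj[i][j] * inv_sqrt_deg[j] for j in range(n)]
        for i in range(n)
    ]


def propagate(norm_adj: Matrix, features: Matrix) -> Matrix:
    """One parameter-free propagation step: aggregate normalized neighbour features."""
    dim = len(features[0])
    n = len(features)
    return [
        [sum(norm_adj[i][k] * features[k][d] for k in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Seven characters with 1-dimensional toy features; only the first is non-zero.
adjacency = normalize(window_adjacency(7, window=5))
toy_features = [[1.0]] + [[0.0]] * 6
one_hop = propagate(adjacency, toy_features)
two_hop = propagate(adjacency, one_hop)  # two steps, mirroring two GCN layers
```

With a window of 5, the first character's signal reaches positions 0 to 4 after one step and all seven positions after two, which is the sense in which stacked graph convolutions capture longer-range relations.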
Tab.5 Software and hardware environments
Item | Environment |
---|---|
Operating system | Windows 10 |
GPU | NVIDIA 3090Ti |
Hard disk | 1 TB |
Memory | 16 GB |
Python version | Python 3.6 |
PyTorch version | PyTorch 1.1 |
Tab.6 Comparison results on MSRA dataset (%)
Model | Precision | Recall | F1 score |
---|---|---|---|
Ref. [ | 88.94 | 84.20 | 86.51 |
Ref. [ | 91.22 | 81.71 | 86.20 |
Ref. [ | 91.86 | 88.75 | 90.28 |
Ref. [ | 92.20 | 90.18 | 91.18 |
Ref. [ | 91.28 | 90.62 | 90.95 |
Proposed model | 94.40 | 93.15 | 93.76 |
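For readers checking the MSRA comparison, the F1 score is the harmonic mean of precision and recall, $F_1 = 2PR/(P+R)$. The short sketch below recomputes it for the five baseline rows; the function name is hypothetical, and the values are copied from the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the five baseline rows on MSRA
BASELINE_ROWS = [
    (88.94, 84.20, 86.51),
    (91.22, 81.71, 86.20),
    (91.86, 88.75, 90.28),
    (92.20, 90.18, 91.18),
    (91.28, 90.62, 90.95),
]
for p, r, reported in BASELINE_ROWS:
    print(f"{f1_score(p, r):.2f} vs reported {reported:.2f}")
```

Small differences from a reported value can arise when the original F1 was computed from unrounded precision and recall.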
Tab.7 Comparison results on BioNLP/NLPBA 2004 dataset (%)
Model | Precision | Recall | F1 score |
---|---|---|---|
Ref. [ | 69.16 | 69.48 | 69.32 |
Ref. [ | - | - | 72.82 |
Ref. [ | 67.82 | 64.80 | 66.28 |
Proposed model | 79.67 | 78.70 | 79.00 |
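The abstract reports precision gains of at least 2.39% and 15.2%. These figures match relative, not absolute, gains over the highest baseline precision in each comparison table (92.20 on MSRA, 69.16 on BioNLP/NLPBA 2004). That reading is an inference from the numbers, and the choice of baseline is an assumption; the helper name is hypothetical.

```python
def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


# Precision (%) of the proposed model vs. the highest baseline precision
msra_gain = relative_gain(94.40, 92.20)     # MSRA
bionlp_gain = relative_gain(79.67, 69.16)   # BioNLP/NLPBA 2004
print(f"MSRA: {msra_gain:.2f}%, BioNLP/NLPBA 2004: {bionlp_gain:.2f}%")
```

For comparison, the absolute differences are 2.20 and 10.51 percentage points.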
1 | HUANG Z H, XU W, YU K. Bidirectional LSTM-CRF models for sequence tagging[EB/OL]. (2015-08-09) [2020-05-01]. |
2 | SONG H J, JO B C, PARK C Y, et al. Comparison of named entity recognition methodologies in biomedical documents[J]. Biomedical Engineering Online, 2018, 17: No.158. 10.1186/s12938-018-0573-6 |
3 | LUO Y, XIAO F S, ZHAO H. Hierarchical contextualized representation for named entity recognition[C]// Proceedings of the 2020 AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 8441-8448. 10.1609/aaai.v34i05.6363 |
4 | GAJENDRAN S, MANJULA D, SUGUMARAN V. Character level and word level embedding with bidirectional LSTM-Dynamic recurrent neural network for biomedical named entity recognition from literature[J]. Journal of Biomedical Informatics, 2020, 112(1): No.103609. |
5 | DONG C H, ZHANG J J, ZONG C Q, et al. Character-based LSTM-CRF with radical-level features for Chinese named entity recognition[C]// Proceedings of the 2016 International Conference on Computer Processing of Oriental Languages/2016 National CCF Conference on Natural Language Processing and Chinese Computing, LNCS 10102. Cham: Springer, 2016: 239-250. |
6 | ZHANG H N, WU D Y, LIU Y, et al. Chinese named entity recognition based on deep neural network[J]. Journal of Chinese Information Processing, 2017, 31(4): 28-35. 10.3969/j.issn.1003-0077.2017.04.005 |
7 | LIU Y H, LIU C J, XU R F, et al. Utilizing glyph feature and iterative learning for named entity recognition in finance text[J]. Journal of Chinese Information Processing, 2020, 34(11): 74-83. 10.3969/j.issn.1003-0077.2020.11.010 |
8 | ZHANG Y, YANG J. Chinese NER using lattice LSTM[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Stroudsburg, PA: Association for Computational Linguistics, 2018: 1554-1564. 10.18653/v1/p18-1144 |
9 | YU X M, FENG W Z, WANG H, et al. An attention mechanism and multi-granularity-based Bi-LSTM model for Chinese Q&A system[J]. Soft Computing, 2020, 24(8):5831-5845. 10.1007/s00500-019-04367-8 |
10 | DAI J H, FENG C, BAI X F, et al. AERNs: attention-based entity region networks for multi-grained named entity recognition[C]// Proceedings of the IEEE 31st International Conference on Tools with Artificial Intelligence. Piscataway: IEEE, 2019: 408-415. 10.1109/ictai.2019.00064 |
11 | ZHANG H, GUO Y B, LI T. Domain named entity recognition combining GAN and BiLSTM-Attention-CRF[J]. Journal of Computer Research and Development, 2019, 56(9): 1851-1858. 10.7544/issn1000-1239.2019.20180733 |
12 | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010. 10.1016/s0262-4079(17)32358-8 |
13 | WANG D X, WANG S G, PEI W S, et al. Named entity recognition based on JCWA-DLSTM for legal instruments[J]. Journal of Chinese Information Processing, 2020, 34(10): 51-58. 10.3969/j.issn.1003-0077.2020.10.007 |
14 | CHEN X Y, SHI S H, ZHAN S Y, et al. Named entity recognition of Chinese electronic medical records based on cascaded conditional random field[C]// Proceedings of the IEEE 4th International Conference on Big Data Analytics. Piscataway: IEEE, 2019: 364-368. 10.1109/icbda.2019.8713244 |
15 | SUN X L, SUN S L, YIN M Z, et al. Hybrid neural conditional random fields for multi-view sequence labeling[J]. Knowledge-Based Systems, 2020, 189: No.105151. 10.1016/j.knosys.2019.105151 |
16 | LIU J M, SUN C, YUAN Y. The BERT-BiLSTM-CRF question event information extraction method[C]// Proceedings of the IEEE 3rd International Conference on Electronic Information and Communication Technology. Piscataway: IEEE, 2020: 729-733. 10.1109/iceict51264.2020.9334197 |
17 | HU J M, ZHENG X. Opinion extraction of government microblog comments via BiLSTM-CRF model[C]// Proceedings of the 2020 ACM/IEEE Joint Conference on Digital Libraries. New York: ACM, 2020: 473-475. 10.1145/3383583.3398570 |
18 | YANG X M, GAO Z H, LI Y M, et al. Bidirectional LSTM-CRF for biomedical named entity recognition[C]// Proceedings of the 14th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery. Piscataway: IEEE, 2018: 239-242. 10.1109/fskd.2018.8687117 |
19 | HU H J, WANG C, DAI J H, et al. Social emergency event judgement based on BiLSTM-CRF[J]. Journal of Chinese Information Processing, 2022, 36(3): 154-161. |
20 | GU X M, LIU J Y, CHENG P S, et al. Malware name recognition in tweets based on enhanced BiLSTM-CRF model[J]. Computer Science, 2020, 47(2): 245-250. |
21 | WU Z H, PAN S R, CHEN F W, et al. A comprehensive survey on graph neural networks[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(1): 4-24. 10.1109/tnnls.2020.2978386 |
22 | YAO L, MAO C S, LUO Y. Graph convolutional networks for text classification[C]// Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2019: 7370-7377. 10.1609/aaai.v33i01.33017370 |
23 | ZHOU J S, HE L, DAI X Y, et al. Chinese named entity recognition with a multi-phase model[C]// Proceedings of the 5th SIGHAN Workshop on Chinese Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2006: 213-216. |
24 | CHEN A T, PENG F C, SHAN R, et al. Chinese named entity recognition with conditional probabilistic models[C]// Proceedings of the 5th SIGHAN Workshop on Chinese Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2006: 173-176. |
25 | ZHOU J H, QU W G, ZHANG F. Chinese named entity recognition via joint identification and categorization[J]. Chinese Journal of Electronics, 2013, 22(2):225-230. |
26 | YIMAM S M, BIEMANN C, MAJNARIC L, et al. An adaptive annotation approach for biomedical entity and relation recognition[J]. Brain Informatics, 2016, 3:157-168. 10.1007/s40708-016-0036-4 |
27 | SONG Y, KIM E, LEE G G, et al. POSBIOTM-NER in the shared task of BioNLP/NLPBA2004[C]// Proceedings of the 2004 International Joint Workshop on Natural Language Processing in Biomedicine and its Applications. [S.l.]: COLING, 2004:103-106. 10.3115/1567594.1567617 |