Encoding-decoding relationship extraction model based on criminal Electra

Journal of Computer Applications, 2022, Vol. 42, Issue (1): 87-93. DOI: 10.11772/j.issn.1001-9081.2021020272

Special topic: Artificial Intelligence

Xiaopeng WANG, Yuanyuan SUN, Hongfei LIN

School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China

Received: 2021-02-21 Revised: 2021-06-27 Accepted: 2021-07-08 Online: 2021-07-29 Published: 2022-01-10
Corresponding author: Yuanyuan SUN
About the authors:
WANG Xiaopeng, born in 1996, from Tianshui, Gansu, M.S. candidate. His research interests include natural language processing.
SUN Yuanyuan, born in 1979, from Dalian, Liaoning, Ph.D., professor. Her research interests include natural language processing.
LIN Hongfei, born in 1962, from Dalian, Liaoning, Ph.D., professor. His research interests include natural language processing.
Supported by:
National Key Research and Development Program of China (2018YFC0830603)

Abstract

Relation extraction models in the judicial field often understand sentence context insufficiently and recognize overlapping relations poorly. To address these problems, an encoding-decoding relation extraction model based on Criminal-Efficiently learning an encoder that classifies token replacements accurately (CriElectra) was proposed. Firstly, following the training method of Chinese Electra, CriElectra was trained on a criminal dataset of 1,000,000 documents. Then, the word features of CriElectra were added to a Bidirectional Long Short-Term Memory (BiLSTM) model to extract features from judicial texts. Finally, a Capsule Network (CapsNet) performed vector clustering on these features to extract the relations between entities. Experimental results on a self-built relation dataset of intentional injury crime show that, compared with the model based on the pre-trained Chinese Electra, retraining on judicial texts lets CriElectra learn word vectors that carry richer domain information, increasing the F1-score by 1.93 percentage points. Compared with the models based on pooling clustering, CapsNet effectively prevents the loss of spatial information through vector operations and improves the recognition of overlapping relations, increasing the F1-score by 3.53 percentage points.

Key words: judicial field, relation extraction, pre-trained language model, Bidirectional Long Short-Term Memory (BiLSTM), Capsule Network (CapsNet)
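The vector clustering step mentioned in the abstract can be illustrated with the dynamic routing procedure of Sabour et al., on which capsule networks are commonly built. The sketch below is only a toy, pure-Python illustration under stated assumptions: the function names (`squash`, `dynamic_routing`), the three routing iterations, and the hand-made prediction vectors `u_hat` are all hypothetical and are not taken from the paper, whose exact CapsNet configuration is not given on this page.

```python
import math

def squash(v):
    """Shrink a vector to a length in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0 for _ in v]
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def length(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

def dynamic_routing(u_hat, iterations=3):
    """Route input capsules to output capsules by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes
    for output capsule j (here: one output capsule per relation type).
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits
    v = []
    for _ in range(iterations):
        # Coupling coefficients: each input distributes itself over outputs.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the output vector.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v

# Hypothetical toy input: 3 input capsules, 2 relation capsules, 2 dimensions.
# Predictions for relation 0 agree; predictions for relation 1 largely cancel.
u_hat = [
    [[1.0, 0.0], [0.5, 0.0]],
    [[0.9, 0.1], [-0.5, 0.0]],
    [[1.0, -0.1], [0.0, 0.5]],
]
lengths = [length(v) for v in dynamic_routing(u_hat)]
```

In this toy input the capsule for relation 0 ends up longer than the capsule for relation 1, because its incoming predictions point the same way. Since each output capsule has its own length, several relations can in principle score highly at once, which is one plausible reading of why a capsule layer suits overlapping relations better than a single pooled vector.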

CLC number: TP391.1

Xiaopeng WANG, Yuanyuan SUN, Hongfei LIN. Encoding-decoding relationship extraction model based on criminal Electra[J]. Journal of Computer Applications, 2022, 42(1): 87-93.

Figures and tables (9)

Fig. 1 Structure of CELCN model

Fig. 2 CriElectra training example

Fig. 3 Structure of capsule network model

Fig. 4 Relationship distribution

Fig. 5 F1-score curves of CELCN and ELCN

Fig. 6 F1-score curves of CELCN, MBLCN and XBLCN

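Tab. 1 Performance comparison of different models (%)

Comparison setting	Model	Precision	Recall	F1-score
Pre-trained model comparison	XBLCN	77.86	83.33	80.51
Pre-trained model comparison	MBLCN	76.73	78.08	76.58
Pre-trained model comparison	ELCN	75.72	81.51	77.95
Feature extraction model comparison	CERCN	75.13	81.57	78.21
Feature extraction model comparison	CECCN	76.64	78.72	77.26
Feature extraction model comparison	CECN	76.89	80.98	79.47
Feature clustering model comparison	CELAP	72.52	80.82	76.15
Feature clustering model comparison	CELMP	76.48	78.31	76.35
Proposed model	CELCN	77.26	82.68	79.88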
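
As a quick sanity check on Table 1, the sketch below recomputes the CELCN F1-score as the harmonic mean of its precision and recall, and recomputes the two F1 gains quoted in the abstract from the table entries. The helper name `f1_score` is an illustrative assumption. Note that some other rows (for example MBLCN) do not reproduce exactly under this formula, which may mean the reported scores are averaged per relation class; the page does not say.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# F1 values copied from Table 1 (percent).
celcn_f1 = 79.88  # proposed model
elcn_f1 = 77.95   # row in the pre-trained model comparison
celmp_f1 = 76.35  # row in the feature clustering comparison

# CELCN: precision 77.26, recall 82.68.
recomputed = round(f1_score(77.26, 82.68), 2)

# Differences in percentage points; these match the gains in the abstract.
gain_over_elcn = round(celcn_f1 - elcn_f1, 2)
gain_over_celmp = round(celcn_f1 - celmp_f1, 2)

print(recomputed, gain_over_elcn, gain_over_celmp)
```

The gains are reported in percentage points (absolute differences of F1 values), not as relative improvements.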

Fig. 7 F1-score curves of CELMP and CELAP

Tab. 2 Experimental results on some multi-label overlapping relationship data (%)

Model	Precision	Recall	F1-score
CELAP	39.78	45.16	42.30
CELMP	42.97	34.65	38.65
CELCN	43.88	41.32	42.56

