Multi-task learning model for charge prediction with action words

doi:10.11772/j.issn.1001-9081.2023010029

Journal of Computer Applications, 2024, Vol. 44, Issue (1): 159-166. DOI: 10.11772/j.issn.1001-9081.2023010029

• Artificial intelligence

Xiao GUO^1,2, Yanping CHEN^1,2, Ruixue TANG^1,2,3, Ruizhang HUANG^1,2, Yongbin QIN^1,2

^1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang Guizhou 550025, China
^2. College of Computer Science and Technology, Guizhou University, Guiyang Guizhou 550025, China
^3. School of Information, Guizhou University of Finance and Economics, Guiyang Guizhou 550025, China

Received: 2023-01-11  Revised: 2023-03-18  Accepted: 2023-03-28  Online: 2023-06-06  Published: 2024-01-10
Contact: Yanping CHEN
About the authors: GUO Xiao, born in 1998, M.S. candidate. His research interests include natural language processing and information extraction.
TANG Ruixue, born in 1987, Ph.D. candidate. Her research interests include natural language processing.
HUANG Ruizhang, born in 1979, Ph.D., professor. Her research interests include text mining and data fusion.
QIN Yongbin, born in 1980, Ph.D., professor. His research interests include enterprise informatization and e-government.
Supported by:
National Natural Science Foundation of China (62166007); Youth Science and Technology Talents Growth Project of Guizhou Education Department (KY[2022]205)

Additional author details (Chinese-language record):
GUO Xiao (born 1998), male, from Yangquan, Shanxi; CCF member.
TANG Ruixue (born 1987), female, from Guiyang, Guizhou.
HUANG Ruizhang (born 1979), female, from Tianjin; CCF member.
QIN Yongbin (born 1980), male, from Zhaoyuan, Shandong; CCF member.
First contact: CHEN Yanping (born 1980), male, from Changshun, Guizhou; professor, Ph.D., CCF member. His research interests include artificial intelligence and natural language processing.

Abstract

With the application of artificial intelligence in the judicial field, charge prediction, which aims to predict the applicable charge from a case description, has become an important research topic. Case descriptions use professional terminology and are concise and rigorous. However, existing methods often rely on text features alone, ignore the differences among case-relevant elements, and make little effective use of action words across diverse cases. To solve these problems, a multi-task learning model for charge prediction with action words was proposed. Firstly, a boundary identifier generated the spans of action words, extracting the core content of the case. Secondly, the charge was predicted by constructing structural features of the action words. Finally, action word identification and charge prediction were modeled jointly, with shared parameters enhancing the generalization ability of the model. A multi-task dataset covering action word identification and charge prediction was constructed for verification. The experimental results show that the proposed model achieves an F value of 83.27% on the action word identification task and 84.29% on the charge prediction task; compared with BERT-CNN, the F values increase by 0.57% and 2.61%, respectively, which verifies the advantage of the proposed model in both tasks.
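The first and last steps of the abstract (span generation from boundary predictions, then joint modeling of the two tasks) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the function names `generate_spans` and `joint_loss`, the threshold, the span length cap, and the loss weight do not come from the paper, which does not give these details on this page.

```python
from typing import List, Tuple


def generate_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,
    max_len: int = 3,
) -> List[Tuple[int, int]]:
    """Pair likely start boundaries with likely end boundaries.

    Returns inclusive (start, end) token indices. The threshold and the
    length cap are illustrative assumptions, not values from the paper.
    """
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [j for j, p in enumerate(end_probs) if p >= threshold]
    return [(i, j) for i in starts for j in ends if i <= j < i + max_len]


def joint_loss(span_loss: float, charge_loss: float, weight: float = 1.0) -> float:
    """Weighted sum of the two task losses; `weight` is an assumed knob."""
    return span_loss + weight * charge_loss


# Hypothetical boundary scores for a six-token sentence.
start_probs = [0.1, 0.9, 0.2, 0.1, 0.8, 0.1]
end_probs = [0.1, 0.2, 0.7, 0.1, 0.9, 0.3]
spans = generate_spans(start_probs, end_probs)
print(spans)
```

In a real model the candidate spans would be scored by a classifier on top of a shared encoder, and minimizing a combined loss of this kind is one common way to let both tasks update the same parameters.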

Key words: charge prediction, action word, boundary identification, graph convolution neural network, multi-task learning
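The keywords mention a graph convolution neural network. As a rough illustration only, the sketch below implements one layer of the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W) in plain Python. The adjacency matrix, node features, and weights are hypothetical; how the paper actually builds the graph over action words is not stated on this page.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Add self-loops, then apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_norm: Matrix, h: Matrix, w: Matrix) -> Matrix:
    """One graph convolution followed by ReLU."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical graph: two action-word nodes joined by one edge.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, to make the mixing visible
out = gcn_layer(normalize_adjacency(adj), features, weights)
print(out)
```

With identity weights, each node's output is simply the normalized average of its own features and its neighbor's, which shows how a layer mixes information along edges.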

CLC Number: TP391

Xiao GUO, Yanping CHEN, Ruixue TANG, Ruizhang HUANG, Yongbin QIN. Multi-task learning model for charge prediction with action words[J]. Journal of Computer Applications, 2024, 44(1): 159-166.

Figures/Tables 11

Fig. 1 Examples of action words and charges

Fig. 2 Multi-task learning model for charge predictions with action words

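Tab. 1 Number distribution of action words and charges

Charge type	Number of cases	Number of action words
Intentional homicide	242	2,493
Intentional injury	115	1,052
Drug trafficking	21	200
Drug transportation	13	101
Counterfeiting registered trademarks	26	173
Dangerous driving	13	83
Smuggling	97	823
Robbery	12	130
Theft	16	136
Fraud	51	481
Bribe-taking	12	86
Smuggling, trafficking, transporting, and manufacturing drugs	81	626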
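The totals implied by Tab. 1 are not stated on this page, so the short sketch below derives them from the table rows. The English charge names are translations used only as dictionary keys, and the variable names are illustrative.

```python
# (cases, action words) per charge, copied from Tab. 1.
distribution = {
    "intentional homicide": (242, 2493),
    "intentional injury": (115, 1052),
    "drug trafficking": (21, 200),
    "drug transportation": (13, 101),
    "counterfeiting registered trademarks": (26, 173),
    "dangerous driving": (13, 83),
    "smuggling": (97, 823),
    "robbery": (12, 130),
    "theft": (16, 136),
    "fraud": (51, 481),
    "bribe-taking": (12, 86),
    "smuggling, trafficking, transporting, and manufacturing drugs": (81, 626),
}

total_cases = sum(cases for cases, _ in distribution.values())
total_action_words = sum(words for _, words in distribution.values())
largest_charge = max(distribution, key=lambda name: distribution[name][0])

print(total_cases, total_action_words)  # 699 6384
print(f"{total_action_words / total_cases:.2f} action words per case on average")
print(f"largest class: {largest_charge}")
```

The class sizes range from 12 to 242 cases, so the dataset is clearly imbalanced, which is worth keeping in mind when reading the per-charge results.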

Tab. 2 Parameter settings

Parameter	Value
Batch size	4
Epoch	100
Dropout	0.5
Character embedding dimension	1024
Weight decay rate	0.01
Learning rate	5×10^-6
Learning rate warmup	0.1
Number of GCN layers	2
Input sequence length	512
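For readers who want to reuse these settings, the values in Tab. 2 can be collected in a small configuration object. This is only a sketch: the class and field names are hypothetical, and treating the warmup value 0.1 as a fraction of total training steps with linear warmup and linear decay is an assumption, since the page does not describe the schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from Tab. 2; field names are illustrative."""
    batch_size: int = 4
    epochs: int = 100
    dropout: float = 0.5
    embedding_dim: int = 1024
    weight_decay: float = 0.01
    learning_rate: float = 5e-6
    warmup_ratio: float = 0.1  # assumed to be a fraction of total steps
    gcn_layers: int = 2
    max_seq_len: int = 512


def scheduled_lr(step: int, total_steps: int, cfg: TrainConfig) -> float:
    """Linear warmup followed by linear decay (an assumed schedule)."""
    warmup_steps = max(1, int(total_steps * cfg.warmup_ratio))
    if step < warmup_steps:
        return cfg.learning_rate * (step + 1) / warmup_steps
    remaining = max(1, total_steps - warmup_steps)
    return cfg.learning_rate * max(0.0, (total_steps - step) / remaining)


cfg = TrainConfig()
peak = scheduled_lr(99, 1000, cfg)   # last warmup step reaches the peak rate
final = scheduled_lr(1000, 1000, cfg)
print(peak, final)
```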

Tab. 3 Experimental result comparison among different models (%)

Model	Action word identification P	Action word identification R	Action word identification F	Charge prediction F_micro
BERT-Attention [32]	79.33	81.12	80.22	80.00
BERT-BiGRU [33]	78.65	82.45	80.50	80.71
BERT-CNN [34]	81.18	84.48	82.80	82.14
Proposed model	82.69	83.85	83.27	84.29
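The numbers in Tab. 3 can be cross-checked with a few lines of arithmetic. The sketch below recomputes F as the harmonic mean of P and R, and compares the proposed model with BERT-CNN. The absolute gaps are 0.47 and 2.15 points, while the relative gains come out near the 0.57% and 2.61% quoted in the abstract, so the abstract's figures appear to be relative improvements; that reading is an inference from the table, not something the page states.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def relative_gain(new: float, base: float) -> float:
    """Improvement of `new` over `base`, as a percentage of `base`."""
    return (new - base) / base * 100


# (P, R, F, charge F_micro) rows copied from Tab. 3.
table3 = {
    "BERT-Attention": (79.33, 81.12, 80.22, 80.00),
    "BERT-BiGRU": (78.65, 82.45, 80.50, 80.71),
    "BERT-CNN": (81.18, 84.48, 82.80, 82.14),
    "Proposed": (82.69, 83.85, 83.27, 84.29),
}

for name, (p, r, f, _) in table3.items():
    print(f"{name}: reported F = {f:.2f}, recomputed F = {f_score(p, r):.2f}")

word_gain = relative_gain(table3["Proposed"][2], table3["BERT-CNN"][2])
charge_gain = relative_gain(table3["Proposed"][3], table3["BERT-CNN"][3])
print(f"relative gains over BERT-CNN: {word_gain:.3f}% and {charge_gain:.3f}%")
```

Recomputed F values can differ from the reported ones in the last digit because P and R are themselves rounded.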

Fig. 3 Comparison experiment results with or without semantics of action words

Tab. 4 Experiment results of confusable charge prediction (F_micro, %)

Method	Intentional homicide	Intentional injury
Text semantics	82.47	57.17
Action word semantics	84.85	60.00

Tab. 5 Comparative experiment results of action word identification with single-task models (%)

Model	P	R	F
BERT-IDCNN-CRF [36]	80.08	80.89	80.48
BERT-BiGRU-CRF [37]	81.57	81.83	81.70
BERT-BiLSTM-Attention [38]	83.94	80.34	82.10
Proposed model	82.69	83.85	83.27

Tab. 6 Comparative experiment results of charge prediction with single-task models (%)

Model	Charge prediction F_micro
TextRNN-Attention [39]	70.71
FastText [40]	77.14
TextCNN [41]	80.71
Proposed model	84.29
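Charge prediction is reported as F_micro. As a reminder of what that metric measures, the sketch below pools true positives, false positives, and false negatives over all classes before computing F. It assumes each case carries exactly one gold charge and one predicted charge, in which case micro-averaged F equals plain accuracy; the labels and the function name are hypothetical, and the paper's own evaluation script is not shown on this page.

```python
from typing import Sequence


def micro_f(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Micro-averaged F: pool TP/FP/FN over all labels, then compute F."""
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1  # predicted this label, gold was another
            elif g == label:
                fn += 1  # gold was this label, predicted another
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions: three of five cases are correct.
gold = ["theft", "fraud", "robbery", "theft", "fraud"]
pred = ["theft", "fraud", "theft", "theft", "robbery"]
score = micro_f(gold, pred)
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(score, accuracy)
```

Because every wrong prediction counts once as a false positive and once as a false negative, pooled precision and recall coincide in the single-label setting.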

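Tab. 7 Ablation experiment results (%)

Model	Action word identification P	Action word identification R	Action word identification F	Charge prediction F_micro
Proposed model	82.69	83.85	83.27	84.29
- correlation construction	82.00	83.15	82.57	82.14
- boundary identification	76.65	81.67	79.08	83.57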
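To make the ablation easier to read, the drops relative to the full model can be computed directly from Tab. 7. The sketch only restates the table's numbers as differences; the dictionary keys are illustrative labels.

```python
# (action word F, charge F_micro) copied from Tab. 7.
full = (83.27, 84.29)
ablations = {
    "without correlation construction": (82.57, 82.14),
    "without boundary identification": (79.08, 83.57),
}

drops = {
    name: (round(full[0] - word_f, 2), round(full[1] - charge_f, 2))
    for name, (word_f, charge_f) in ablations.items()
}

for name, (word_drop, charge_drop) in drops.items():
    print(f"{name}: action word F -{word_drop:.2f}, charge F_micro -{charge_drop:.2f}")
```

Removing correlation construction costs more on charge prediction (2.15 points), while removing boundary identification costs more on action word identification (4.19 points), which matches the role each component plays in the model.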

Fig. 4 Case study

References 41

[1] WEI B, KUANG K, SUN C, et al. A full-process intelligent trial system for smart court [J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23: 186-206. DOI: 10.1631/fitee.2100041
[2] KORT F. Predicting Supreme Court decisions mathematically: A quantitative analysis of the "right to counsel" cases [J]. American Political Science Review, 1957, 51(1): 1-12. DOI: 10.2307/1951767
[3] NAGEL S S. Applying correlation analysis to case prediction [J]. Texas Law Review, 1963, 42: 1006.
[4] YU Y, FU Y, WU X P. Summary of text classification methods [J]. Chinese Journal of Network and Information Security, 2019, 5(5): 1-8 (in Chinese). DOI: 10.11959/j.issn.2096-109x.2019045
[5] BAI C Q, DAI X, ZHANG A. Crime prediction based on data enhancement and improved BERT [J]. Computer and Information Technology, 2023, 31(1): 37-40 (in Chinese). DOI: 10.3969/j.issn.1005-1228.2023.01.012
[6] PENG T, YANG L, ZHANG L, et al. Research on charge prediction based on multi-source joint analysis [J]. Computer Engineering and Applications, 2023, 59(4): 290-296 (in Chinese). DOI: 10.3778/j.issn.1002-8331.2108-0339
[7] HU Z, LI X, TU C, et al. Few-shot charge prediction with discriminative legal attributes [C]// Proceedings of the 27th International Conference on Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2018: 487-498.
[8] LIU Z L, ZHANG M S, ZHEN R R, et al. Multi-task learning model for legal judgment predictions with charge keywords [J]. Journal of Tsinghua University (Science and Technology), 2019, 59(7): 497-504 (in Chinese).
[9] LI T, QIN Y B, HUANG R Z, et al. Research on Chinese predicate verb recognition based on neural network [J]. Journal of Data Acquisition & Processing, 2020, 35(3): 582-590 (in Chinese).
[10] DEVLIN J, CHANG M-W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [EB/OL]. [2019-05-24]. DOI: 10.18653/v1/n18-2
[11] SCHUSTER M, PALIWAL K K. Bidirectional recurrent neural networks [J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681. DOI: 10.1109/78.650093
[12] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks [EB/OL]. [2019-09-09]. DOI: 10.48550/arXiv.1609.02907
[13] ULMER S S. Quantitative analysis of judicial processes: Some practical and theoretical applications [J]. Law and Contemporary Problems, 1963, 28(1): 164-184. DOI: 10.2307/1190728
[14] KEOWN R. Mathematical models for legal prediction [J]. Computer/Law Journal, 1980, 2: 829-862. DOI: 10.1016/0270-0255(80)90042-1
[15] LIU C-L, CHANG C-T, HO J-H. Case instance generation and refinement for case-based criminal summary judgments in Chinese [J]. Journal of Information Science and Engineering, 2004, 20: 783-800. DOI: 10.1145/1047788.1047840
[16] BIJALWAN V, KUMARI V, KUMARI P, et al. KNN based machine learning approach for text and document mining [J]. International Journal of Database Theory and Application, 2014, 7(1): 61-70. DOI: 10.14257/ijdta.2014.7.1.06
[17] SULEA O-M, ZAMPIERI M, MALMASI S, et al. Exploring the use of text classification in the legal domain [EB/OL]. [2017-10-25]. DOI: 10.1109/icds47004.2019.8942343
[18] ZHANG H, BERG A C, MAIRE M, et al. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition [C]// Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2006: 2126-2136.
[19] LIU Y-H, CHEN Y-L, HO W-L. Predicting associated statutes for legal problems [J]. Information Processing & Management, 2015, 51(1): 194-211. DOI: 10.1016/j.ipm.2014.07.003
[20] LIN W-C, KUO T-T, CHANG T-J, et al. Exploiting machine learning models for Chinese legal documents labeling, case classification, and sentencing prediction [J]. Computational Linguistic and Chinese Language Processing, 2012, 17(4): 49-68.
[21] LAFFERTY J D, McCALLUM A, PEREIRA F C. Conditional random fields: Probabilistic models for segmenting and labeling sequence data [C]// Proceedings of the 18th International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers Inc., 2001: 282-289.
[22] LUO B, FENG Y, XU J, et al. Learning to predict charges for criminal cases with legal basis [EB/OL]. [2017-07-28]. DOI: 10.18653/v1/d17-1289
[23] LONG S, TU C, LIU Z, et al. Automatic judgment prediction via legal reading comprehension [C]// Proceedings of the 2019 China National Conference on Chinese Computational Linguistics. Cham: Springer, 2019: 558-572. DOI: 10.1007/978-3-030-32381-3_45
[24] YE H, JIANG X, LUO Z, et al. Interpretable charge predictions for criminal cases: Learning to generate court views from fact descriptions [EB/OL]. [2018-02-23]. DOI: 10.18653/v1/n18-1168
[25] ZHAO Q H, LI D P, GAO T H, et al. A charge prediction method based on graph attention network: CP-GAT [J]. Journal of Northeastern University (Natural Science), 2021, 42(12): 1681-1687 (in Chinese).
[26] SUN Q, QIN Y B, HUANG R Z, et al. Charge prediction method combined with case elements sequence [J]. Big Data Research, 2021, 7(6): 30-40 (in Chinese). DOI: 10.11959/j.issn.2096-0271.2021058
[27] NI Q C, YIN C J, ZHAO D H. Classification of attributes and charges based on BERT and keywords [J]. Journal of Computer Applications, 2021, 41(S2): 36-40 (in Chinese).
[28] WANG Z Y, CHEN Y G, XING T J, et al. Joint entity and relation extraction for multi-crime legal documents with multi-task learning [J]. Computer Engineering and Applications, 2023, 59(2): 178-184 (in Chinese). DOI: 10.3778/j.issn.1002-8331.2108-0344
[29] HENDRYCKS D, GIMPEL K. Gaussian Error Linear Units (GELUs) [EB/OL]. [2020-07-08].
[30] GIRSHICK R. Fast R-CNN [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. DOI: 10.1109/iccv.2015.169
[31] CHEN Y, JIN W, QIN Y, et al. Annotation of Chinese predicate heads and relevant elements [EB/OL]. [2021-04-01].
[32] MAO M Y, WU C, ZHONG Y X, et al. BERT named entity recognition model with self-attention mechanism [J]. CAAI Transactions on Intelligent Systems, 2020, 15(4): 772-779 (in Chinese). DOI: 10.11992/tis.202003003
[33] YU Q, WANG Z, JIANG K. Research on text classification based on BERT-BiGRU model [J]. Journal of Physics: Conference Series, 2021, 1746: 012019. DOI: 10.1088/1742-6596/1746/1/012019
[34] SHI Z J, DONG Z W, PANG C Y, et al. Sentiment analysis of e-commerce reviews based on BERT-CNN [J]. Intelligent Computer and Applications, 2020, 10(2): 7-11 (in Chinese). DOI: 10.3969/j.issn.2095-2163.2020.02.002
[35] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[36] CAI X, SUN E, LEI J. Research on application of named entity recognition of electronic medical records based on BERT-IDCNN-CRF model [C]// Proceedings of the 6th International Conference on Graphics and Signal Processing. New York: ACM, 2022: 80-85. DOI: 10.1145/3561518.3561531
[37] QIN Q, ZHAO S, LIU C. A BERT-BiGRU-CRF model for entity recognition of Chinese electronic medical records [J]. Complexity, 2021, 2021: 6631837. DOI: 10.1155/2021/6631837
[38] NI J, CHEN P X. Internet financial entity recognition method based on Bert-BiLSTM-Attention [J]. China Computer & Communication, 2021, 33(20): 58-61 (in Chinese). DOI: 10.3969/j.issn.1003-9767.2021.20.018
[39] LIU P, QIU X, HUANG X. Recurrent neural network for text classification with multi-task learning [EB/OL]. [2016-05-17]. DOI: 10.18653/v1/d16-1012
[40] JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification [EB/OL]. [2016-08-09]. DOI: 10.18653/v1/e17-2068
[41] ZHANG Y, WALLACE B. A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification [EB/OL]. [2016-04-06]. DOI: 10.18653/v1/d16-1076
