Journal of Computer Applications, 2024, Vol. 44, Issue (1): 159-166. DOI: 10.11772/j.issn.1001-9081.2023010029
Special topic: Artificial intelligence
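Xiao GUO1,2, Yanping CHEN1,2, Ruixue TANG1,2,3, Ruizhang HUANG1,2, Yongbin QIN1,2
Received: 2023-01-11
Revised: 2023-03-18
Accepted: 2023-03-28
Online: 2023-06-06
Published: 2024-01-10
Contact: Yanping CHEN
About author: GUO Xiao, born in 1998 in Yangquan, Shanxi, M. S. candidate, CCF member. His research interests include natural language processing and information extraction.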
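Abstract:
With the application of artificial intelligence technology in the judicial field, predicting the applicable charge from a case description has become an important research topic. Case descriptions use specialized terminology and are concise, yet existing methods tend to rely on text features, ignore the differences among the relevant elements of different cases, and make little effective use of the action words in the case description. To address these problems, a multi-task learning model for charge prediction that incorporates action words is proposed. First, a boundary recognizer generates action word spans to distill the core content of the case. Second, the charge is predicted by constructing structural features of the action words. Finally, action word recognition and charge prediction are modeled jointly, and parameter sharing is used to improve the generalization ability of the model. A multi-task dataset for action word recognition and charge prediction was constructed for validation. Experimental results show that the model reaches an F-value of 83.27% on action word recognition and 84.29% on charge prediction, improvements of 0.57% and 2.61% respectively over the BERT-CNN model, which confirms the advantage of the model on both tasks.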
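The first step described in the abstract, generating action word spans from a boundary recognizer, can be illustrated with a minimal sketch. It assumes, which the source does not state, that the recognizer emits one start score and one end score per character and that spans are formed by pairing confident starts with confident ends within a length cap. The names `generate_spans`, `threshold`, and `max_len` are hypothetical.

```python
from typing import List, Tuple


def generate_spans(
    start_scores: List[float],
    end_scores: List[float],
    threshold: float = 0.5,
    max_len: int = 4,
) -> List[Tuple[int, int]]:
    """Pair confident start positions with confident end positions.

    Returns inclusive (start, end) character index pairs. The thresholding
    and the length cap are illustrative assumptions, not the paper's method.
    """
    spans = []
    for i, s in enumerate(start_scores):
        if s < threshold:
            continue
        # Only consider ends at or after the start and within max_len characters.
        for j in range(i, min(i + max_len, len(end_scores))):
            if end_scores[j] >= threshold:
                spans.append((i, j))
    return spans


# Toy example: six characters with hypothetical boundary scores.
start = [0.1, 0.9, 0.2, 0.1, 0.8, 0.1]
end = [0.1, 0.2, 0.7, 0.1, 0.1, 0.9]
candidate_spans = generate_spans(start, end)
print(candidate_spans)
```

In a joint model, candidate spans like these would then be scored and passed on to the charge classifier; how the paper does this is not recoverable from the abstract alone.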
Xiao GUO, Yanping CHEN, Ruixue TANG, Ruizhang HUANG, Yongbin QIN. Multi-task learning model for charge prediction with action words[J]. Journal of Computer Applications, 2024, 44(1): 159-166.
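Charge type | Data size | Number of action words |
---|---|---|
Intentional homicide | 242 | 2 493 |
Intentional injury | 115 | 1 052 |
Drug trafficking | 21 | 200 |
Drug transportation | 13 | 101 |
Counterfeiting registered trademarks | 26 | 173 |
Dangerous driving | 13 | 83 |
Smuggling | 97 | 823 |
Robbery | 12 | 130 |
Theft | 16 | 136 |
Fraud | 51 | 481 |
Bribe-taking | 12 | 86 |
Smuggling, trafficking, transporting, and manufacturing drugs | 81 | 626 |
Tab. 1 Number distribution of action words and charges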
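The distribution in Tab. 1 is strongly imbalanced, which a few lines of arithmetic make concrete. The sketch below copies the table's rows and assumes that "data size" means the number of case descriptions per charge; the variable names are illustrative.

```python
# Rows copied from Tab. 1: (data size, number of action words) per charge.
table1 = [
    (242, 2493), (115, 1052), (21, 200), (13, 101), (26, 173), (13, 83),
    (97, 823), (12, 130), (16, 136), (51, 481), (12, 86), (81, 626),
]

total_cases = sum(cases for cases, _ in table1)
total_action_words = sum(words for _, words in table1)
largest_share = max(cases for cases, _ in table1) / total_cases

print(total_cases, total_action_words)
print(f"{total_action_words / total_cases:.2f} action words per case on average")
print(f"largest class share: {largest_share:.1%}")
```

The largest class (intentional homicide) accounts for roughly a third of the data, while several charges have only 12 or 13 examples.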
Parameter | Value | Parameter | Value |
---|---|---|---|
Batch size | 4 | Learning rate | 5×10⁻⁶ |
Epoch | 100 | Learning rate warmup | 0.1 |
Dropout | 0.5 | Number of GCN layers | 2 |
Character vector dimension | 1 024 | Input sequence length | 512 |
Weight decay rate | 0.01 | | |
Tab. 2 Parameter setting
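The settings in Tab. 2 can be collected into a small configuration object. The sketch below also shows one common way a warmup value of 0.1 is used: as the fraction of training steps over which the learning rate rises linearly before decaying linearly to zero. That interpretation and the schedule shape are assumptions, since the source gives only the value; `TrainConfig`, `lr_at`, and `total_steps` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values taken from Tab. 2.
    batch_size: int = 4
    epochs: int = 100
    dropout: float = 0.5
    char_vector_dim: int = 1024
    weight_decay: float = 0.01
    learning_rate: float = 5e-6
    warmup_proportion: float = 0.1  # assumed to be a fraction of total steps
    gcn_layers: int = 2
    max_seq_len: int = 512


def lr_at(step: int, total_steps: int, cfg: TrainConfig) -> float:
    """Linear warmup followed by linear decay (an assumed schedule)."""
    warmup_steps = max(1, int(cfg.warmup_proportion * total_steps))
    if step < warmup_steps:
        return cfg.learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return cfg.learning_rate * remaining / max(1, total_steps - warmup_steps)


cfg = TrainConfig()
total_steps = 1000  # hypothetical; the real value depends on the training-set size
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps, cfg))
```

With these settings the rate peaks at 5×10⁻⁶ once warmup ends and returns to zero at the final step.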
Model | Action word recognition P | Action word recognition R | Action word recognition F | Charge prediction F |
---|---|---|---|---|
BERT-Attention | 79.33 | 81.12 | 80.22 | 80.00 |
BERT-BiGRU | 78.65 | 82.45 | 80.50 | 80.71 |
BERT-CNN | 81.18 | 84.48 | 82.80 | 82.14 |
Proposed model | 82.69 | 83.85 | 83.27 | 84.29 |
Tab. 3 Experimental result comparison among different models (%)
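The abstract reports gains of 0.57% and 2.61% over BERT-CNN without saying whether they are absolute or relative. A quick check against the F-values in Tab. 3 suggests they are relative improvements: the absolute differences are 0.47 and 2.15 points. The helper name `relative_gain` is illustrative.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


# F-values (%) from Tab. 3: proposed model versus BERT-CNN.
action_word_gain = relative_gain(83.27, 82.80)
charge_gain = relative_gain(84.29, 82.14)

print(f"action word recognition: {action_word_gain:.2f}%")
print(f"charge prediction: {charge_gain:.2f}%")
```

The computed values agree with the reported figures to within rounding.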
Method | Intentional homicide | Intentional injury |
---|---|---|
Text semantics | 82.47 | 57.17 |
Action word semantics | 84.85 | 60.00 |
Tab. 4 Experiment results of confusable charge prediction (%)
Model | Action word recognition P | Action word recognition R | Action word recognition F |
---|---|---|---|
BERT-IDCNN-CRF | 80.08 | 80.89 | 80.48 |
BERT-BIGRU-CRF | 81.57 | 81.83 | 81.70 |
BERT-BILSTM-Attention | 83.94 | 80.34 | 82.10 |
Proposed model | 82.69 | 83.85 | 83.27 |
Tab. 5 Comparative experiment results of action word recognition with single-task models (%)
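The F column in Tab. 5 is consistent with the usual harmonic mean of precision and recall. The source does not define its F-value, so treating it as F1 is an assumption; the sketch below recomputes it from the tabulated P and R and compares it with the reported value.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); inputs and output in percent."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) rows copied from Tab. 5.
rows = {
    "BERT-IDCNN-CRF": (80.08, 80.89, 80.48),
    "BERT-BIGRU-CRF": (81.57, 81.83, 81.70),
    "BERT-BILSTM-Attention": (83.94, 80.34, 82.10),
    "Proposed model": (82.69, 83.85, 83.27),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: computed {f_score(p, r):.2f}, reported {reported:.2f}")
```

Small discrepancies in the last digit are possible elsewhere, because the tabulated P and R are themselves rounded.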
Model | Charge prediction F_micro |
---|---|
TextRNN-Attention | 70.71 |
FastText | 77.14 |
TextCNN | 80.71 |
Proposed model | 84.29 |
Tab. 6 Comparative experiment results of charge prediction with single-task models (%)
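Tab. 6 reports charge prediction as F_micro, presumably the micro-averaged F-value. A minimal sketch of that metric follows: true positives, false positives, and false negatives are pooled over all classes before precision and recall are computed. The toy labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter
from typing import Hashable, Sequence


def micro_f1(gold: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # class p was predicted but is wrong
            fn[g] += 1  # gold class g was missed
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    if total_tp == 0:
        return 0.0
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)


gold = ["homicide", "injury", "fraud", "theft", "injury"]
pred = ["homicide", "homicide", "fraud", "theft", "injury"]
score = micro_f1(gold, pred)
print(f"{score:.2f}")
```

When every case carries exactly one charge, each error counts once as a false positive and once as a false negative, so micro-averaged F1 equals plain accuracy.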
Model | Action word recognition P | Action word recognition R | Action word recognition F | Charge prediction F_micro |
---|---|---|---|---|
Proposed model | 82.69 | 83.85 | 83.27 | 84.29 |
w/o correlation construction | 82.00 | 83.15 | 82.57 | 82.14 |
w/o boundary recognition | 76.65 | 81.67 | 79.08 | 83.57 |
Tab. 7 Ablation experiment results (%)
[1] WEI B, KUANG K, SUN C, et al. A full-process intelligent trial system for smart court [J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23: 186-206. DOI: 10.1631/fitee.2100041
[2] KORT F. Predicting Supreme Court decisions mathematically: A quantitative analysis of the "right to counsel" cases [J]. American Political Science Review, 1957, 51(1): 1-12. DOI: 10.2307/1951767
[3] NAGEL S S. Applying correlation analysis to case prediction [J]. Texas Law Review, 1963, 42: 1006.
[4] YU Y, FU Y, WU X P. Summary of text classification methods [J]. Chinese Journal of Network and Information Security, 2019, 5(5): 1-8. DOI: 10.11959/j.issn.2096-109x.2019045
[5] BAI C Q, DAI X, ZHANG A. Crime prediction based on data enhancement and improved BERT [J]. Computer and Information Technology, 2023, 31(1): 37-40. DOI: 10.3969/j.issn.1005-1228.2023.01.012
[6] PENG T, YANG L, ZHANG L, et al. Research on charge prediction based on multi-source joint analysis [J]. Computer Engineering and Applications, 2023, 59(4): 290-296. DOI: 10.3778/j.issn.1002-8331.2108-0339
[7] HU Z, LI X, TU C, et al. Few-shot charge prediction with discriminative legal attributes [C]// Proceedings of the 27th International Conference on Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2018: 487-498.
[8] LIU Z L, ZHANG M S, ZHEN R R, et al. Multi-task learning model for legal judgment predictions with charge keywords [J]. Journal of Tsinghua University (Science and Technology), 2019, 59(7): 497-504.
[9] LI T, QIN Y B, HUANG R Z, et al. Research on Chinese predicate verb recognition based on neural network [J]. Journal of Data Acquisition & Processing, 2020, 35(3): 582-590.
[10] DEVLIN J, CHANG M-W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [EB/OL]. [2019-05-24]. DOI: 10.18653/v1/n18-2
[11] SCHUSTER M, PALIWAL K K. Bidirectional recurrent neural networks [J]. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681. DOI: 10.1109/78.650093
[12] KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks [EB/OL]. [2019-09-09]. DOI: 10.48550/arXiv.1609.02907
[13] ULMER S S. Quantitative analysis of judicial processes: Some practical and theoretical applications [J]. Law and Contemporary Problems, 1963, 28(1): 164-184. DOI: 10.2307/1190728
[14] KEOWN R. Mathematical models for legal prediction [J]. Computer/Law Journal, 1980, 2: 829-862. DOI: 10.1016/0270-0255(80)90042-1
[15] LIU C-L, CHANG C-T, HO J-H. Case instance generation and refinement for case-based criminal summary judgments in Chinese [J]. Journal of Information Science and Engineering, 2004, 20: 783-800. DOI: 10.1145/1047788.1047840
[16] BIJALWAN V, KUMARI V, KUMARI P, et al. KNN based machine learning approach for text and document mining [J]. International Journal of Database Theory and Application, 2014, 7(1): 61-70. DOI: 10.14257/ijdta.2014.7.1.06
[17] SULEA O-M, ZAMPIERI M, MALMASI S, et al. Exploring the use of text classification in the legal domain [EB/OL]. [2017-10-25]. DOI: 10.1109/icds47004.2019.8942343
[18] ZHANG H, BERG A C, MAIRE M, et al. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition [C]// Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2006: 2126-2136.
[19] LIU Y-H, CHEN Y-L, HO W-L. Predicting associated statutes for legal problems [J]. Information Processing & Management, 2015, 51(1): 194-211. DOI: 10.1016/j.ipm.2014.07.003
[20] LIN W-C, KUO T-T, CHANG T-J, et al. Exploiting machine learning models for Chinese legal documents labeling, case classification, and sentencing prediction [J]. Computational Linguistic and Chinese Language Processing, 2012, 17(4): 49-68.
[21] LAFFERTY J D, McCALLUM A, PEREIRA F C. Conditional random fields: Probabilistic models for segmenting and labeling sequence data [C]// Proceedings of the 18th International Conference on Machine Learning. San Francisco: Morgan Kaufmann Publishers Inc., 2001: 282-289.
[22] LUO B, FENG Y, XU J, et al. Learning to predict charges for criminal cases with legal basis [EB/OL]. [2017-07-28]. DOI: 10.18653/v1/d17-1289
[23] LONG S, TU C, LIU Z, et al. Automatic judgment prediction via legal reading comprehension [C]// Proceedings of the 2019 China National Conference on Chinese Computational Linguistics. Cham: Springer, 2019: 558-572. DOI: 10.1007/978-3-030-32381-3_45
[24] YE H, JIANG X, LUO Z, et al. Interpretable charge predictions for criminal cases: Learning to generate court views from fact descriptions [EB/OL]. [2018-02-23]. DOI: 10.18653/v1/n18-1168
[25] ZHAO Q H, LI D P, GAO T H, et al. A charge prediction method based on graph attention network: CP-GAT [J]. Journal of Northeastern University (Natural Science), 2021, 42(12): 1681-1687.
[26] SUN Q, QIN Y B, HUANG R Z, et al. Charge prediction method combined with case elements sequence [J]. Big Data Research, 2021, 7(6): 30-40. DOI: 10.11959/j.issn.2096-0271.2021058
[27] NI Q C, YIN C J, ZHAO D H. Classification of attributes and charges based on BERT and keywords [J]. Journal of Computer Applications, 2021, 41(S2): 36-40.
[28] WANG Z Y, CHEN Y G, XING T J, et al. Joint entity and relation extraction for multi-crime legal documents with multi-task learning [J]. Computer Engineering and Applications, 2023, 59(2): 178-184. DOI: 10.3778/j.issn.1002-8331.2108-0344
[29] HENDRYCKS D, GIMPEL K. Gaussian Error Linear Units (GELUs) [EB/OL]. [2020-07-08].
[30] GIRSHICK R. Fast R-CNN [C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. DOI: 10.1109/iccv.2015.169
[31] CHEN Y, JIN W, QIN Y, et al. Annotation of Chinese predicate heads and relevant elements [EB/OL]. [2021-04-01].
[32] MAO M Y, WU C, ZHONG Y X, et al. BERT named entity recognition model with self-attention mechanism [J]. CAAI Transactions on Intelligent Systems, 2020, 15(4): 772-779. DOI: 10.11992/tis.202003003
[33] YU Q, WANG Z, JIANG K. Research on text classification based on BERT-BiGRU model [J]. Journal of Physics: Conference Series, 2021, 1746: 012019. DOI: 10.1088/1742-6596/1746/1/012019
[34] SHI Z J, DONG Z W, PANG C Y, et al. Sentiment analysis of e-commerce reviews based on BERT-CNN [J]. Intelligent Computer and Applications, 2020, 10(2): 7-11. DOI: 10.3969/j.issn.2095-2163.2020.02.002
[35] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2017: 6000-6010.
[36] CAI X, SUN E, LEI J. Research on application of named entity recognition of electronic medical records based on BERT-IDCNN-CRF model [C]// Proceedings of the 6th International Conference on Graphics and Signal Processing. New York: ACM, 2022: 80-85. DOI: 10.1145/3561518.3561531
[37] QIN Q, ZHAO S, LIU C. A BERT-BiGRU-CRF model for entity recognition of Chinese electronic medical records [J]. Complexity, 2021, 2021: 6631837. DOI: 10.1155/2021/6631837
[38] NI J, CHEN P X. Internet financial entity recognition method based on BiLSTM-attention [J]. China Computer & Communication, 2021, 33(20): 58-61. DOI: 10.3969/j.issn.1003-9767.2021.20.018
[39] LIU P, QIU X, HUANG X. Recurrent neural network for text classification with multi-task learning [EB/OL]. [2016-05-17]. DOI: 10.18653/v1/d16-1012
[40] JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification [EB/OL]. [2016-08-09]. DOI: 10.18653/v1/e17-2068
[41] ZHANG Y, WALLACE B. A sensitivity analysis of (and practitioners' guide to) convolutional neural networks for sentence classification [EB/OL]. [2016-04-06]. DOI: 10.18653/v1/d16-1076