Journal of Computer Applications, 2024, Vol. 44, Issue (1): 159-166. DOI: 10.11772/j.issn.1001-9081.2023010029

• Artificial intelligence •

Multi-task learning model for charge prediction with action words

Xiao GUO1,2, Yanping CHEN1,2, Ruixue TANG1,2,3, Ruizhang HUANG1,2, Yongbin QIN1,2

  1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang, Guizhou 550025, China
  2. College of Computer Science and Technology, Guizhou University, Guiyang, Guizhou 550025, China
  3. School of Information, Guizhou University of Finance and Economics, Guiyang, Guizhou 550025, China
  • Received:2023-01-11 Revised:2023-03-18 Accepted:2023-03-28 Online:2023-06-06 Published:2024-01-10
  • Contact: Yanping CHEN
  • About author: GUO Xiao, born in 1998, M. S. candidate. His research interests include natural language processing and information extraction.
    TANG Ruixue, born in 1987, Ph. D. candidate. Her research interests include natural language processing.
    HUANG Ruizhang, born in 1979, Ph. D., professor. Her research interests include text mining and data fusion.
    QIN Yongbin, born in 1980, Ph. D., professor. His research interests include enterprise informatization and e-government.
  • Supported by:
    National Natural Science Foundation of China (62166007); Youth Science and Technology Talents Growth Project of Guizhou Education Department (KY[2022]205)

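Multi-task learning model for charge prediction fusing action words (融合行为词的罪名预测多任务学习模型)

GUO Xiao (郭晓)1,2, CHEN Yanping (陈艳平)1,2, TANG Ruixue (唐瑞雪)1,2,3, HUANG Ruizhang (黄瑞章)1,2, QIN Yongbin (秦永彬)1,2

  1. State Key Laboratory of Public Big Data (Guizhou University), Guiyang 550025
  2. College of Computer Science and Technology, Guizhou University, Guiyang 550025
  3. School of Information, Guizhou University of Finance and Economics, Guiyang 550025
  • Corresponding author: CHEN Yanping
  • About the authors: GUO Xiao (born 1998), male, from Yangquan, Shanxi, M. S. candidate, CCF member. Main research interests: natural language processing, information extraction;
    TANG Ruixue (born 1987), female, from Guiyang, Guizhou, Ph. D. candidate. Main research interests: natural language processing;
    HUANG Ruizhang (born 1979), female, from Tianjin, professor, Ph. D., CCF member. Main research interests: text mining, data fusion;
    QIN Yongbin (born 1980), male, from Zhaoyuan, Shandong, professor, Ph. D., CCF member. Main research interests: enterprise informatization, e-government.
    First contact: CHEN Yanping (born 1980), male, from Changshun, Guizhou, professor, Ph. D., CCF member. Main research interests: artificial intelligence, natural language processing;
  • Supported by:
    National Natural Science Foundation of China (62166007); Youth Science and Technology Talents Growth Project of Guizhou Education Department (Qian Jiao He KY Zi [2022] No. 205)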

Abstract:

With the application of artificial intelligence technology in the judicial field, charge prediction, which aims to predict the applicable charges from a case description, has become an important research topic. Case descriptions use professional terminology and are concise and rigorous. However, existing methods often rely on text features alone, ignore the differences among the relevant elements of different cases, and fail to make effective use of the action words in a case. To solve these problems, a multi-task learning model for charge prediction incorporating action words was proposed. Firstly, the spans of action words were generated by a boundary identifier to extract the core content of the case. Secondly, the charge was predicted by constructing structural features of the action words. Finally, action word identification and charge prediction were modeled jointly, and the generalization ability of the model was enhanced by parameter sharing. A multi-task dataset for action word identification and charge prediction was constructed to verify the model. The experimental results show that the proposed model achieves an F value of 83.27% on the action word identification task and an F value of 84.29% on the charge prediction task; compared with BERT-CNN, the F values increase by 0.57% and 2.61%, respectively, which verifies the advantage of the proposed model in both action word identification and charge prediction.

Key words: charge prediction, action word, boundary identification, graph convolution neural network, multi-task learning
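
The first step named in the abstract, generating action word spans from boundary predictions, and the F value used for evaluation can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the boundary identifier is taken to output per-token start and end probabilities, candidate spans pair each likely start with a likely end at or after it within a maximum length, and the F value is taken to be the harmonic mean of precision and recall over exactly matched spans. The names `generate_spans` and `f_value`, the threshold, and `max_len` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start_index, end_index), both inclusive


def generate_spans(start_probs: Sequence[float],
                   end_probs: Sequence[float],
                   threshold: float = 0.5,
                   max_len: int = 4) -> List[Span]:
    """Pair likely start boundaries with likely end boundaries.

    Assumption: a token is a boundary when its probability reaches
    `threshold`, and a span covers at most `max_len` tokens.
    """
    starts = [i for i, p in enumerate(start_probs) if p >= threshold]
    ends = [j for j, p in enumerate(end_probs) if p >= threshold]
    return [(i, j) for i in starts for j in ends if i <= j < i + max_len]


def f_value(predicted: Sequence[Span], gold: Sequence[Span]) -> float:
    """Harmonic mean of precision and recall over exact span matches."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy probabilities for a four-token sentence (illustrative values only).
    start_probs = [0.9, 0.1, 0.8, 0.2]
    end_probs = [0.1, 0.7, 0.1, 0.9]
    spans = generate_spans(start_probs, end_probs, max_len=2)
    print(spans)
    print(f_value(spans, gold=[(0, 1)]))
```

In the paper's setting the candidate spans would then feed the charge classifier; this sketch covers only span enumeration and scoring, not the graph convolutional network or the shared-parameter training.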

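Abstract (from the Chinese version):

With the application of artificial intelligence technology in the judicial field, predicting the applicable charge from a case description has become an important research topic. Case content uses professional terminology and is described concisely, yet existing methods often rely on text features, ignore the differences among the relevant elements of different cases, and lack effective use of the action word elements of a case. To solve such problems, a multi-task learning model for charge prediction that fuses action words is proposed. First, a boundary identifier generates action word spans to distill the core content of the case; second, the charge is predicted by constructing structural features of the action words; finally, action word identification and charge prediction are modeled in a unified way, and the generalization ability of the model is enhanced by sharing parameters. A multi-task dataset for action word identification and charge prediction was constructed for verification. Experimental results show that the model reaches an F value of 83.27% on the action word identification task and 84.29% on the charge prediction task, improvements of 0.57% and 2.61% respectively over the BERT-CNN model, which verifies the model's advantage in action word identification and charge prediction.

Key words (from the Chinese version): charge prediction, action word, boundary identification, graph convolutional neural network, multi-task learning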

CLC Number: