Journal of Computer Applications, 2024, Vol. 44, Issue (2): 403-410. DOI: 10.11772/j.issn.1001-9081.2023030270

• Artificial Intelligence •

Violent crime grading algorithm based on joint modeling of an improved hierarchical attention network and TextCNN

ZHANG Jiawei1,2, GAO Guandong2,3, XIAO Ke1,4, SONG Shengzun5

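  1. College of Information Science and Technology, Hebei Agricultural University, Baoding, Hebei 071000, China
    2. The Centre of Data Science and Intelligent Correction Technology, The National Police University for Criminal Justice, Baoding, Hebei 071000, China
    3. Department of Information Management, The National Police University for Criminal Justice, Baoding, Hebei 071000, China
    4. Hebei Key Laboratory of Agricultural Big Data (Hebei Agricultural University), Baoding, Hebei 071000, China
    5. Department of Penology, The National Police University for Criminal Justice, Baoding, Hebei 071000, China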
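  • Received: 2023-03-16 Revised: 2023-05-09 Accepted: 2023-05-11 Online: 2023-05-23 Published: 2024-02-10
  • Corresponding author: GAO Guandong
  • About the authors: ZHANG Jiawei (born 1998), male, from Handan, Hebei, M. S. candidate. His main research interests include natural language processing and criminal psychological profiling.
    XIAO Ke (born 1980), female, from Neijiang, Sichuan, professor, Ph. D., CCF member. Her main research interests include agricultural engineering, the Internet of Things, and machine vision.
    SONG Shengzun (born 1967), female, from Shenzhou, Hebei, professor, M. S. Her main research interests include criminal psychology, emotion management of prisoners, and affective computing.
  • Supported by:
    Social Science Foundation of Hebei Province (HB21ZZ002)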

Violent crime hierarchy algorithm by joint modeling of improved hierarchical attention network and TextCNN

Jiawei ZHANG1,2, Guandong GAO2,3, Ke XIAO1,4, Shengzun SONG5

  1. College of Information Science and Technology,Hebei Agricultural University,Baoding Hebei 071000,China
    2.The Centre of Data Science and Intelligent Correction Technology,The National Police University for Criminal Justice,Baoding Hebei 071000,China
    3.Department of Information Management,The National Police University for Criminal Justice,Baoding Hebei 071000,China
    4.Hebei Key Laboratory of Agricultural Big Data (Hebei Agricultural University),Baoding Hebei 071000,China
    5.Department of Penology,The National Police University for Criminal Justice,Baoding Hebei 071000,China
  • Received:2023-03-16 Revised:2023-05-09 Accepted:2023-05-11 Online:2023-05-23 Published:2024-02-10
  • Contact: Guandong GAO
  • About author:ZHANG Jiawei, born in 1998, M. S. candidate. His research interests include natural language processing, criminal psychological portrait.
    XIAO Ke, born in 1980, Ph. D., professor. Her research interests include agricultural engineering, internet of things, machine vision.
    SONG Shengzun, born in 1967, M. S., professor. Her research interests include criminal psychology, prisoner’s emotional management, affective computing.
  • Supported by:
    Social Science Foundation of Hebei Province(HB21ZZ002)

Abstract:

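To grade prisoners' violent tendencies scientifically and intelligently, text classification methods from Natural Language Processing (NLP) were introduced into criminal psychology, and a Criminal semantic Convolutional Hierarchical Attention Network (CCHA-Net) was proposed. CCHA-Net jointly models two channels, an improved Hierarchical Attention Network (HAN) and a Text Convolutional Neural Network (TextCNN), which mine the semantic information of the crime facts and of the prisoner's basic information respectively, to grade violent criminal temperament. Firstly, Focal Loss replaced the Cross-Entropy function in both channels to mitigate the imbalance in sample numbers. Secondly, positional encoding was introduced in the input layers of both channels to improve the perception of positional information, and the HAN channel was improved by using max pooling to construct a salient vector. Finally, global average pooling replaced the fully connected method in both output layers to avoid overfitting. Experimental results show that, compared with 17 related baseline models such as AC-BiLSTM (Attention-based Bidirectional Long Short-Term Memory with Convolution layer) and Support Vector Machine (SVM), CCHA-Net achieves the best results on all metrics: its micro-average F1 (Micro_F1) is 99.57%, and its Area Under the Curve (AUC) under macro-averaging and micro-averaging is 99.45% and 99.89%, respectively, which are 4.08, 5.59 and 0.74 percentage points higher than those of the second-best model, AC-BiLSTM. These results verify that CCHA-Net can effectively perform the violent criminal temperament grading task.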
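The two pooling operations named in the abstract can be illustrated with a minimal sketch. The abstract does not say exactly where in each channel they are applied or what the tensor shapes are, so the function names, the toy inputs, and the "one feature map per class" layout below are assumptions for illustration only, not the authors' implementation.

```python
def max_pool_over_sequence(vectors):
    """Element-wise maximum across a sequence of equal-length vectors.

    Assumption: this is one plausible way to build a "salient vector",
    keeping the strongest activation of each feature dimension.
    """
    return [max(column) for column in zip(*vectors)]


def global_average_pool(feature_maps):
    """Average each feature map over all of its positions.

    Assumption: one feature map per class, so the output has one score
    per class and no fully connected weights are needed.
    """
    return [sum(fm) / len(fm) for fm in feature_maps]


# Hypothetical hidden states for three tokens with three features each.
hidden_states = [
    [0.1, 0.9, -0.3],
    [0.7, 0.2, 0.4],
    [0.0, 0.5, 0.8],
]
salient_vector = max_pool_over_sequence(hidden_states)

# Hypothetical per-class feature maps with two positions each.
class_maps = [[1.0, 3.0], [2.0, 2.0], [0.0, 1.0]]
class_scores = global_average_pool(class_maps)
```

Because global average pooling has no trainable parameters, it removes the weights of a fully connected output layer, which is the usual argument for why it reduces overfitting.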

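Keywords: deep learning, text classification, convolutional neural network, hierarchical attention network, violent crime grading, temperament type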

Abstract:

To grade the violent tendencies of prisoners scientifically and intelligently, text classification methods from Natural Language Processing (NLP) were introduced into the field of criminal psychology. A Criminal semantic Convolutional Hierarchical Attention Network (CCHA-Net) was proposed, based on the joint modeling of two channels, an improved HAN (Hierarchical Attention Network) and TextCNN (Text Convolutional Neural Network), to grade violent criminal temperament by separately mining the semantic information of the crime facts and of the prisoners' basic information. Firstly, Focal Loss was used to replace the Cross-Entropy function in both channels to mitigate the imbalance in sample numbers. Secondly, positional encoding was introduced in the input layers of both channels to improve the perception of positional information, and the HAN channel was improved by using max pooling to construct salient vectors. Finally, global average pooling was used to replace the fully connected method in both output layers to avoid overfitting. Experimental results show that, compared with 17 related baseline models such as AC-BiLSTM (Attention-based Bidirectional Long Short-Term Memory with Convolution layer) and Support Vector Machine (SVM), CCHA-Net achieves the best results on all metrics: the micro-average F1 (Micro_F1) is 99.57%, and the Area Under the Curve (AUC) under macro-averaging and micro-averaging is 99.45% and 99.89%, respectively, which are 4.08, 5.59 and 0.74 percentage points higher than those of the second-best model, AC-BiLSTM. These results verify that CCHA-Net can effectively perform the violent criminal temperament grading task.
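As a rough illustration of why Focal Loss helps with imbalanced sample numbers, the sketch below compares it with Cross-Entropy for a single sample, using the standard form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The abstract does not report the values of gamma or alpha, so gamma = 2.0, the optional per-class alpha weights, and the toy logits are assumptions, and the pure-Python functions are hypothetical helpers rather than the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are assumed values; alpha, if given, is a list of
    per-class weights indexed by the target class.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A well-classified sample and a misclassified one (toy logits, three classes).
easy_logits, hard_logits, target = [4.0, 0.0, 0.0], [0.0, 2.0, 0.0], 0

easy_ce = cross_entropy(easy_logits, target)
easy_fl = focal_loss(easy_logits, target)
hard_ce = cross_entropy(hard_logits, target)
hard_fl = focal_loss(hard_logits, target)
```

The factor (1 - p_t)^gamma shrinks the loss of well-classified samples far more than that of misclassified ones, so the many easy samples of a majority class contribute less to the gradient. With gamma = 0 and no alpha, the expression reduces to ordinary Cross-Entropy.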

Key words: deep learning, text classification, Convolutional Neural Network (CNN), Hierarchical Attention Network (HAN), hierarchy of violent crime, temperament type

CLC number: