Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (8): 2450-2455. DOI: 10.11772/j.issn.1001-9081.2019010033

• Frontier & interdisciplinary applications

Regional bullying recognition based on joint hierarchical attentional network and independent recurrent neural network

MENG Zhao1, TIAN Shengwei1, YU Long2, WANG Ruijin3   

  1. School of Software, Xinjiang University, Urumqi Xinjiang 830008, China;
    2. Network Center, Xinjiang University, Urumqi Xinjiang 830046, China;
    3. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China
  • Received: 2019-01-07 Revised: 2019-03-05 Online: 2019-08-10 Published: 2019-04-15
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61563051, 61662074, 61262064), the Key Project of National Natural Science Foundation of China (61331011), the Xinjiang Uygur Autonomous Region Scientific and Technological Personnel Training Project (QN2016YX0051), and the Xinjiang Tianshan Youth Plan Project (2017Q011).

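  • Corresponding author: TIAN Shengwei (田生伟)
  • About the authors: MENG Zhao (孟曌), born in 1994, female, from Changzhi, Shanxi, M.S. candidate; main research interests: artificial intelligence, natural language processing. TIAN Shengwei (田生伟), born in 1973, male, from Urumqi, Xinjiang, Ph.D., professor and doctoral supervisor; main research interests: artificial intelligence, big data analysis, information security. YU Long (禹龙), born in 1974, female, from Urumqi, Xinjiang, M.S., professor and doctoral supervisor; main research interests: cyberspace, big data analysis, information security. WANG Ruijin (王瑞锦), born in 1980, male, from Chengdu, Sichuan, Ph.D., lecturer; main research interests: quantum communication, big data analysis and security.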

Abstract: To make better use of the deep information in text context, a regional bullying semantic recognition model called HACBI (HAN_CNN_BiLSTM_IndRNN) was proposed, based on the Hierarchical Attention Network (HAN) and the Independent Recurrent Neural Network (IndRNN). Firstly, manually annotated regional bullying texts were mapped into a low-dimensional vector space by word embedding. Secondly, the local and global semantic features of the bullying texts were extracted with a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) network, and the internal structure information of the text was captured by HAN. Finally, to avoid losing the hierarchical structure information of the text and to address the vanishing gradient problem, IndRNN was introduced to enhance the descriptive ability of the model and to integrate the information flows. Experimental results show that the model achieves Accuracy (Acc), Precision (P), Recall (R), F1-measure (F1) and Area Under Curve (AUC) values of 99.57%, 98.54%, 99.02%, 98.78% and 99.35% respectively, a significant improvement over text classification models such as Support Vector Machine (SVM) and CNN.
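
As a quick consistency check on the reported metrics, the F1-measure can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The sketch below is illustrative only; the function name `f1_score` is a hypothetical helper, not code from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision, recall and F1 as reported in the abstract, in percent.
reported_p, reported_r, reported_f1 = 98.54, 99.02, 98.78

computed_f1 = f1_score(reported_p, reported_r)
print(round(computed_f1, 2))  # agrees with the reported F1 to two decimals
```

Rounded to two decimal places, the recomputed value matches the reported 98.78%.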

Key words: regional bullying, structural information, Hierarchical Attention Network (HAN), Independent Recurrent Neural Network (IndRNN), word vector, context
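
The abstract does not give the IndRNN formulation, so the following is a minimal sketch under the assumption that the layer follows the commonly cited IndRNN recurrence h_t = ReLU(W x_t + u ⊙ h_{t-1} + b), where the recurrent weight u is a vector and each neuron sees only its own previous state. All names (`indrnn_step`, `indrnn_forward`) and the toy weights are hypothetical; this is not the HACBI implementation.

```python
from typing import List


def indrnn_step(x_t: List[float], h_prev: List[float],
                W: List[List[float]], u: List[float],
                b: List[float]) -> List[float]:
    """One IndRNN step: h_t[n] = relu(W[n] . x_t + u[n] * h_prev[n] + b[n])."""
    h_t = []
    for n in range(len(u)):
        input_term = sum(w * x for w, x in zip(W[n], x_t))
        # Element-wise recurrence: neuron n uses only its own previous state.
        h_t.append(max(0.0, input_term + u[n] * h_prev[n] + b[n]))
    return h_t


def indrnn_forward(xs: List[List[float]], W: List[List[float]],
                   u: List[float], b: List[float]) -> List[float]:
    """Run the recurrence over a sequence, starting from a zero state."""
    h = [0.0] * len(u)
    for x_t in xs:
        h = indrnn_step(x_t, h, W, u, b)
    return h


# Toy example (assumed values): two neurons, identity input weights.
W = [[1.0, 0.0], [0.0, 1.0]]
u = [0.5, 2.0]   # per-neuron recurrent weights
b = [0.0, 0.0]
xs = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
print(indrnn_forward(xs, W, u, b))
```

In this toy run the first neuron's state decays (|u| < 1) while the second grows (|u| > 1), independently of each other. Because the recurrence is element-wise, the gradient through time for each neuron is governed by its own scalar weight, which is the property usually cited as making vanishing or exploding gradients easier to control.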
