Regional bullying recognition based on joint hierarchical attentional network and independent recurrent neural network

doi:10.11772/j.issn.1001-9081.2019010033

Journal of Computer Applications, 2019, Vol. 39, Issue (8): 2450-2455. DOI: 10.11772/j.issn.1001-9081.2019010033

• Frontier & interdisciplinary applications •

MENG Zhao¹, TIAN Shengwei¹, YU Long², WANG Ruijin³

1. School of Software, Xinjiang University, Urumqi Xinjiang 830008, China;
2. Network Center, Xinjiang University, Urumqi Xinjiang 830046, China;
3. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China

Received: 2019-01-07 Revised: 2019-03-05 Online: 2019-08-10 Published: 2019-04-15
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61563051, 61662074, 61262064), the Key Project of National Natural Science Foundation of China (61331011), the Xinjiang Uygur Autonomous Region Scientific and Technological Personnel Training Project (QN2016YX0051), and the Xinjiang Tianshan Youth Plan Project (2017Q011).

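Corresponding author: TIAN Shengwei
About the authors: MENG Zhao (born 1994), female, from Changzhi, Shanxi, M.S. candidate; research interests: artificial intelligence, natural language processing. TIAN Shengwei (born 1973), male, from Urumqi, Xinjiang, professor, doctoral supervisor, Ph.D.; research interests: artificial intelligence, big data analysis, information security. YU Long (born 1974), female, from Urumqi, Xinjiang, professor, doctoral supervisor, M.S.; research interests: cyberspace, big data analysis, information security. WANG Ruijin (born 1980), male, from Chengdu, Sichuan, lecturer, Ph.D.; research interests: quantum communication, big data analysis and security.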

Abstract

Abstract: To make better use of deep contextual information in text, a regional bullying semantic recognition model called HACBI (HAN_CNN_BiLSTM_IndRNN) was proposed, based on the Hierarchical Attention Network (HAN) and the Independent Recurrent Neural Network (IndRNN). Firstly, manually annotated regional bullying texts were mapped into a low-dimensional vector space by word embedding. Secondly, the local and global semantic information of the bullying texts was extracted with a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) network, and the internal structure information of the text was captured by HAN. Finally, to avoid losing text hierarchy information and to address the vanishing gradient problem, IndRNN was introduced to enhance the descriptive ability of the model and to integrate the information flow. Experimental results show that the model achieves an Accuracy (Acc), Precision (P), Recall (R), F1-Measure (F1) and Area Under Curve (AUC) of 99.57%, 98.54%, 99.02%, 98.78% and 99.35% respectively, a significant improvement over text classification models such as Support Vector Machine (SVM) and CNN.
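The reported F1 can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that check; the function name f1_from_pr is a hypothetical helper introduced here, not something taken from the paper.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same unit, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent.
reported_p = 98.54
reported_r = 99.02
reported_f1 = 98.78

computed_f1 = f1_from_pr(reported_p, reported_r)

# The recomputed value should agree with the reported one to two decimals.
print(f"computed F1 = {computed_f1:.2f}%, reported F1 = {reported_f1:.2f}%")
```

Under this definition the three reported figures are mutually consistent to two decimal places.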

Key words: regional bullying, structural information, Hierarchical Attention Network (HAN), Independent Recurrent Neural Network (IndRNN), word vector, context
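For readers unfamiliar with IndRNN, the sketch below illustrates the recurrence as it is commonly stated: h_t = ReLU(W x_t + u * h_{t-1} + b), where the recurrent weight u is a vector applied element-wise, so each hidden unit only sees its own previous state. This is an illustration under that assumption, written in plain Python; the function names, shapes and toy weights are hypothetical and do not reproduce the HACBI implementation, which the abstract does not detail.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def indrnn_step(x_t: Vector, h_prev: Vector, W: Matrix, u: Vector, b: Vector) -> Vector:
    """One IndRNN step: h_t[n] = relu(W[n] . x_t + u[n] * h_prev[n] + b[n])."""
    h_t = []
    for n in range(len(u)):
        # Input projection for unit n.
        pre = sum(W[n][k] * x_t[k] for k in range(len(x_t)))
        # Element-wise recurrence: unit n depends only on its own past state.
        pre += u[n] * h_prev[n] + b[n]
        h_t.append(max(0.0, pre))  # ReLU activation
    return h_t


def indrnn_forward(xs: List[Vector], W: Matrix, u: Vector, b: Vector) -> Vector:
    """Run the recurrence over a sequence, starting from a zero state."""
    h = [0.0] * len(u)
    for x_t in xs:
        h = indrnn_step(x_t, h, W, u, b)
    return h


# Toy example with two hidden units and two input features (illustrative values).
W = [[1.0, 0.0], [0.0, 1.0]]
u = [0.5, 2.0]
b = [0.0, -1.0]
xs = [[1.0, 1.0], [1.0, 1.0]]
final_state = indrnn_forward(xs, W, u, b)
print(final_state)
```

Because the recurrent connection is element-wise rather than a full matrix, the gradient through time for each unit depends on powers of a single scalar u[n], which is the property usually credited with easing the vanishing gradient problem mentioned in the abstract.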

CLC Number:

TP391
TP181

MENG Zhao, TIAN Shengwei, YU Long, WANG Ruijin. Regional bullying recognition based on joint hierarchical attentional network and independent recurrent neural network[J]. Journal of Computer Applications, 2019, 39(8): 2450-2455.

