Regional bullying recognition based on joint hierarchical attentional network and independent recurrent neural network
MENG Zhao1, TIAN Shengwei1, YU Long2, WANG Ruijin3
1. School of Software, Xinjiang University, Urumqi Xinjiang 830008, China; 2. Network Center, Xinjiang University, Urumqi Xinjiang 830046, China; 3. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu Sichuan 611731, China
Abstract: To make better use of the deep information in text context, a regional bullying semantic recognition model named HACBI (HAN_CNN_BiLSTM_IndRNN) was proposed, based on the Hierarchical Attention Network (HAN) and the Independent Recurrent Neural Network (IndRNN). Firstly, manually annotated regional bullying texts were mapped into a low-dimensional vector space by word embedding. Secondly, the local and global semantic information of the bullying texts was extracted with a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) network, and the internal structure information of the text was captured by HAN. Finally, to avoid losing text hierarchy information and to address the vanishing gradient problem, IndRNN was introduced to strengthen the representation ability of the model and to integrate the information flow. Experimental results show that the model achieves an Accuracy (Acc) of 99.57%, a Precision (P) of 98.54%, a Recall (R) of 99.02%, an F1-measure (F1) of 98.78% and an Area Under Curve (AUC) of 99.35%, a significant improvement over text classification models such as Support Vector Machine (SVM) and CNN.
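The abstract does not spell out the IndRNN layer, so the following is only a minimal sketch of the recurrence as it is commonly formulated, h_t = ReLU(W x_t + u ⊙ h_{t-1} + b), in which each hidden neuron is connected only to its own previous state. The function name `indrnn_forward`, the toy weights and the choice of ReLU are assumptions made for illustration; they are not taken from the HACBI implementation.

```python
def indrnn_forward(xs, W, u, b):
    """Run one IndRNN layer over a sequence (illustrative sketch).

    xs: list of T input vectors, each of length d_in
    W:  d_in x d_hid nested list of input weights
    u:  length d_hid list of per-neuron recurrent weights
    b:  length d_hid list of biases
    Returns the list of T hidden states, each of length d_hid.
    """
    d_hid = len(u)
    h = [0.0] * d_hid  # initial hidden state
    states = []
    for x in xs:
        # Element-wise recurrence: neuron j only sees its own previous state h[j],
        # unlike a standard RNN, where a full matrix mixes all previous states.
        h = [
            max(0.0, sum(x[i] * W[i][j] for i in range(len(x))) + u[j] * h[j] + b[j])
            for j in range(d_hid)
        ]
        states.append(h)
    return states


# Toy inputs (hypothetical values chosen so the arithmetic is easy to follow).
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -1.0], [0.25, 2.0]]
u = [0.5, 0.5]
b = [0.0, 0.0]
states = indrnn_forward(xs, W, u, b)
print(states)  # [[0.5, 0.0], [0.5, 2.0], [1.0, 2.0]]
```

Because the recurrent weight is a vector rather than a matrix, the gradient through time for each neuron depends only on its own scalar weight `u[j]`, which is the property usually credited with easing the vanishing gradient problem mentioned above.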