An efficient semi-supervised multi-level intrusion detection algorithm was proposed to solve the problems existing in present intrusion detection algorithms such as difficulty of collecting a lot of tagged data for supervised learning-based algorithms, low accuracy of unsupervised learning-based algorithms and low detection rate on R2L (Remote to Local) and U2L (User to Root) of both types of algorithms. Firstly, according to Kd-tree (K-dimension tree) index structure, weighted density was used to select initial clustering centers of K-means algorithm in high-density sample region. Secondly, the data after clustering were divided into three clusters. Then, weighted voting rule was utilized to expand the labeled dataset by means of Tri-training from the unlabeled clusters and mixed clusters. Finally, a hierarchical classification model with binary tree structure was designed and experimental verification was performed on NSL-KDD dataset. The results show that the semi-supervised multi-level intrusion detection model can effectively improve detection rate of R2L and U2R attacks by using small amount of tagged data, the detection rates of R2L and U2R attacks reach 49.38% and 81.14% respectively, thus reducing the system's false negative rate.
[1] DENNING D E. An intrusion-detection model[J]. IEEE Transactions on Software Engineering, 2006, SE-13(2):222-232.
[2] 孔令智.基于网络异常的入侵检测算法研究[D].北京:北京交通大学,2017:15-16.(KONG L Z. Research on intrusion detection algorithm based on network anomaly[D]. BeiJing:Beijing Jiaotong University, 2017:15-16.)
[3] 沈学利,覃淑娟.基于SMOTE和深度信念网络的异常检测[J].计算机应用,2018,38(7):1941-1945.(SHEN X L, QIN S J. Anomaly detection based on synthetic minority oversampling technique and deep belief network[J]. Journal of Computer Applications, 2018, 38(7):1941-1945.)
[4] YADAV S, SUBRAMANIAN S. Detection of application layer DDoS attack by feature learning using stacked autoencoder[C]//ICCTICT 2016:Proceedings of the 2016 International Conference on Computational Techniques in Information and Communication Technologies. Piscataway, NJ:IEEE, 2016:361-366.
[5] 方圆,李明,王萍,等.基于混合卷积神经网络和循环神经网络的入侵检测模型[J].计算机应用,2018,38(10):2903-2907.(FANG Y, LI M, WANG P, et al. Intrusion detection model based on hybrid convolutional neural network and recurrent neural network[J]. Journal of Computer Applications, 2018, 38(10):2903-2907.)
[6] 高妮,高岭,贺毅岳,等.基于自编码网络特征降维的轻量级入侵检测模型[J].电子学报,2017,45(3):730-739.(GAO N, GAO L, HE Y Y, et al. A lightweight intrusion detection model based on autoencoder network with feature reduction[J]. Acta Electronica Sinica, 2017, 45(3):730-739.)
[7] 贾凡,严妍,张家琪.基于K-means聚类特征消减的网络异常检测[J].清华大学学报(自然科学版),2018,58(2):137-142.(JIA F, YAN Y, ZHANG J Q. K-means based feature reduction for network anomaly detection[J]. Journal of Tsinghua University (Natural Science Edition), 2018, 58(2):137-142.)
[8] PENG K, LEUNG V C M, HUANG Q. Clustering approach based on mini batch Kmeans for intrusion detection system over big data[J]. IEEE Access, 2018, 6(99):11897-11906.
[9] PATHAK V, ANANTHANARAYANA V S. A novel multi-threaded K-means clustering approach for intrusion detection[C]//Proceedings of the 2012 IEEE International Conference on Computer Science and Automation Engineering. Piscataway, NJ:IEEE, 2012:757-760.
[10] FITRIANI S, MANDALA S, MURTI M A. Review of semi-supervised method for intrusion detection system[C]//Proceedings of the 2016 Asia Pacific Conference on Multimedia and Broadcasting. Piscataway, NJ:IEEE, 2016:36-41.
[11] HAWELIYA J, NIGAM B. Network intrusion detection using semi supervised support vector machine[J]. International Journal of Computer Applications, 2014, 85(9):27-31.
[12] KUMAR K M, REDDY A R M. A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method[J]. Pattern Recognition, 2016, 58:39-48.
[13] ZHOU Z H, LI M. Tri-training:exploiting unlabeled data using three classifiers[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(11):1529-1541.
[14] 刘开云.基于KD-Tree的KNN沙尘孤立点监测算法的研究与应用[D].开封:河南大学,2018:22-24.(LIU K Y. Research and application of KNN sand-dust isolated point monitoring algorithm based on KD-Tree[D]. Kaifeng:Henan University, 2018:22-24.)
[15] REDMOND S J, HENEGHAN C. A method for initialising the K-means clustering algorithm using kd-trees[J]. Pattern Recognition Letters, 2007, 28(8):965-973.
[16] KANUNGO T, MOUNT D M, NETANYAHU N S, et al. The analysis of a simple K-means clustering algorithm[C]//Proceedings of the Sixteenth Annual Symposium on Computational Geometry. New York:ACM, 2000:100-109.
[17] KUMAR K M, REDDY A R M. An efficient K-means clustering filtering algorithm using density based initial cluster centers[J]. Information Sciences, 2017, 418/419:286-301.
[18] AL-JARRAH O Y, AL-HAMMDI Y, YOO P D, et al. Semi-supervised multi-layered clustering model for intrusion detection[J]. Digital Communications and Networks, 2018, 4(4):277-286.
[19] AHMIM A, DERDOUR M, FERRAG M A. An intrusion detection system based on combining probability predictions of a tree of classifiers[J]. International Journal of Communication Systems, 2018, 31(9):e3457.
[20] TAVALLAEE M, BAGHERI E, LU W, et al. A detailed analysis of the KDD CUP 99 data set[C]//Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications. Piscataway, NJ:IEEE, 2009:1-6.
[21] ZHANG X F, ZHU P D, TIAN J W, et al. An effective semi-supervised model for intrusion detection using feature selection based LapSVM[C]//CITS 2017:Proceedings of the 2017 International Conference on Computer, Information and Telecommunication Systems. Piscataway, NJ:IEEE, 2017:283-286.
[22] ASHFAQ R A R, WANG X Z, HUANG J Z, et al. Fuzziness based semi-supervised learning approach for intrusion detection system[J]. Information Sciences, 2017, 378:484-497.
[23] CATALTEPE Z, EKMEKÇI U, CATALTEPE T, et al. Online feature selected semi-supervised decision trees for network intrusion detection[C]//NOMS 2016:Proceedings of the 2016 IEEE/IFIP Network Operations and Management Symposium. Piscataway, NJ:IEEE, 2016:1085-1088.