Journal of Computer Applications, 2018, Vol. 38, Issue (7): 1941-1945. DOI: 10.11772/j.issn.1001-9081.2018010178


Anomaly detection based on synthetic minority oversampling technique and deep belief network

SHEN Xueli, QIN Shujuan   

  1. School of Electronics and Information Engineering, Liaoning Technical University, Huludao Liaoning 125105, China
  • Received: 2018-01-21 Revised: 2018-03-27 Online: 2018-07-10 Published: 2018-07-12
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61602227).

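  • Corresponding author: QIN Shujuan
  • About the authors: SHEN Xueli (born 1969), male, from Lianyungang, Jiangsu, professor, M.S.; main research interests: information security and network security. QIN Shujuan (born 1993), female, from Tacheng, Xinjiang, M.S. candidate; main research interest: network security.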

Abstract: To address the low detection rate for intrusions that belong to minority classes in massive imbalanced datasets, an anomaly detection method based on the Synthetic Minority Oversampling Technique (SMOTE) and a Deep Belief Network (DBN), called the SMOTE-DBN method, was proposed. First, SMOTE was used to increase the number of samples in the minority classes. Second, on the resulting more balanced dataset, an unsupervised Restricted Boltzmann Machine (RBM) was used to reduce the dimensionality of the preprocessed high-dimensional data. Third, the model parameters were fine-tuned with the Back Propagation (BP) algorithm to obtain a better low-dimensional representation of the preprocessed data. Finally, a softmax classifier was used to classify this low-dimensional data. Simulation experiments on the KDD1999 dataset show that SMOTE preprocessing improves the model's detection rate for minority-class samples. On the same dataset, compared with the DBN method and the Support Vector Machine (SVM) method, the detection rate of the SMOTE-DBN method is higher by 3.31 and 7.34 percentage points respectively, and the false alarm rate is lower by 1.11 and 2.67 percentage points respectively.

Key words: Synthetic Minority Oversampling Technique (SMOTE), Deep Belief Network (DBN), Restricted Boltzmann Machine (RBM), Logistic Regression (LR), intrusion detection
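
The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, pure-Python variant of SMOTE that picks a random minority sample, chooses one of its k nearest neighbours, and interpolates between the two. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority-class samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from one minority class.
    n_synthetic: number of synthetic samples to create.
    k: number of nearest neighbours considered for each sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # Precompute the k nearest neighbours (Euclidean) of every minority sample.
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # pick a base sample
        j = rng.choice(neighbours[i])      # pick one of its neighbours
        gap = rng.random()                 # interpolation factor in [0, 1)
        x, nb = minority[i], minority[j]
        # The new point lies on the segment between the sample and its neighbour.
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy data (made up for illustration, not KDD1999 records).
rare_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(rare_class, n_synthetic=6, k=2)
balanced_rare_class = rare_class + new_samples
```

Because every synthetic point is a convex combination of two existing minority samples, the new samples stay inside the region already occupied by that class. In a pipeline like the one the abstract describes, this step would run before the RBM-based dimensionality reduction.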

CLC Number: