Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (7): 1979-1984. DOI: 10.11772/j.issn.1001-9081.2019010018

• Cyber security •

Efficient semi-supervised multi-level intrusion detection algorithm

CAO Weidong, XU Zhixiang

  1. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China
  • Received: 2019-01-07 Revised: 2019-02-27 Online: 2019-04-15 Published: 2019-07-10
  • Corresponding author: XU Zhixiang
  • About the authors: CAO Weidong (born 1964), female, from Tianjin, associate professor, Ph.D., CCF member; research interests: civil aviation information system processing and network security. XU Zhixiang (born 1993), female, from Dongying, Shandong, M.S. candidate; research interests: airborne information systems and network security.
  • Supported by:

    This work is partially supported by the Civil Aviation Safety Capacity Building Project (AADSA0018) and the Civil Aviation Administration Science and Technology Innovation Guidance Fund (MHRD20160109).

Abstract:

Supervised intrusion detection algorithms need large amounts of labeled data that are hard to collect, unsupervised algorithms have low accuracy, and both kinds have low detection rates on R2L (Remote to Local) and U2R (User to Root) attacks. To address these problems, an efficient semi-supervised multi-level intrusion detection algorithm is proposed. First, using the Kd-tree (K-dimension tree) index structure, weighted density is used to select the initial cluster centers of the K-means algorithm in high-density sample regions. Second, the clustered data are divided into three types of clusters, and Tri-training with a weighted voting rule is applied to the unlabeled and mixed clusters to expand the labeled dataset. Finally, a hierarchical classification model with a binary tree structure is designed and verified experimentally on the NSL-KDD dataset. The results show that, using only a small amount of labeled data, the semi-supervised multi-level intrusion detection model reaches detection rates of 49.38% and 81.14% on R2L and U2R attacks respectively, effectively improving detection of these two attack types and thereby reducing the system's false negative rate.

Key words: intrusion detection, Kd-tree, Tri-training, semi-supervised, multi-level
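
The abstract mentions a weighted voting rule for expanding the labeled dataset but does not give its form. As a rough illustration only, the sketch below shows one way such a rule could work: three classifiers each predict a label, each vote counts in proportion to a per-classifier weight, and a sample is pseudo-labeled only when the winning label's share of the total weight reaches a threshold. The function names, the weights, and the `threshold` parameter are all assumptions made for this example, not the authors' method, and the retraining loop of Tri-training is omitted.

```python
from collections import defaultdict


def weighted_vote(predictions, weights, threshold=0.6):
    """Combine classifier votes; return (label or None, winning share).

    Hypothetical rule: a label is accepted only when its share of the
    total classifier weight is at least `threshold`.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    best = max(totals, key=totals.get)
    share = totals[best] / sum(weights)
    return (best if share >= threshold else None), share


def expand_labeled_set(unlabeled, classifiers, weights, threshold=0.6):
    """Pseudo-label the samples on which the weighted vote is confident.

    `classifiers` is a list of callables mapping a sample to a label
    (an assumption of this sketch; any trained model could be wrapped).
    """
    accepted = []
    for sample in unlabeled:
        votes = [clf(sample) for clf in classifiers]
        label, _ = weighted_vote(votes, weights, threshold)
        if label is not None:
            accepted.append((sample, label))
    return accepted
```

In this sketch the weights might come from each classifier's accuracy on the existing labeled data, so that a more reliable classifier has more influence; samples left without a confident label simply stay unlabeled.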

CLC Number: