Journal of Computer Applications (计算机应用) ›› 2010, Vol. 30 ›› Issue (3): 695-698.

• Information Security •

Application of Ensemble-Learning-Based Self-training in Intrusion Detection

程仲汉 (CHENG Zhonghan)1, 臧洌 (ZANG Lie)2

  1. College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics
    2.
  • Received: 2009-09-06 Revised: 2009-10-22 Online: 2010-03-14 Published: 2010-03-01
  • Corresponding author: 程仲汉 (CHENG Zhonghan)

Application of self-training based on ensemble learning in intrusion detection

  • Received:2009-09-06 Revised:2009-10-22 Online:2010-03-14 Published:2010-03-01

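Abstract (translated from the Chinese): To address the difficulty of obtaining labeled data for intrusion detection, an ensemble-learning-based Self-training method, regularized Self-training, is proposed. The method combines active learning with regularization theory and uses unlabeled data to further improve an existing classifier (one that has already learned the classification patterns well). Comparative experiments were run on three main ensemble learning methods under different proportions of labeled data. The results show that a large amount of unlabeled data can improve the classification boundary of the combined classifier, and that the algorithm significantly reduces the error rate of the resulting classifier.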

Keywords (translated from the Chinese): semi-supervised learning, ensemble learning, intrusion detection

Abstract: Regularized self-training, a new self-training method based on ensemble learning, is proposed to address the shortage of labeled training samples in intrusion detection. The algorithm combines active learning with regularization theory and uses unlabeled data to improve existing classifiers. Experiments were run on three main ensemble learning algorithms under different proportions of unlabeled data. The results show that, with the help of large amounts of unlabeled data, the proposed method improves the classification boundary of the ensemble classifiers and reduces their error rate.

Key words: semi-supervised learning, ensemble learning, intrusion detection
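
The abstract does not give the algorithm's details, so the following is only a minimal sketch of plain self-training with a bagged ensemble, not the authors' regularized method: it omits the regularization and active-learning components entirely. The nearest-centroid base learner, the agreement threshold, the function names, and the synthetic "normal"/"attack" data are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_centroids(samples):
    """Fit a nearest-centroid classifier: one mean vector per class label."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_one(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def train_ensemble(samples, n_members, rng):
    """Bagging: each member is trained on a bootstrap resample of the data."""
    return [
        train_centroids([rng.choice(samples) for _ in samples])
        for _ in range(n_members)
    ]


def vote(ensemble, x):
    """Majority vote; returns (label, fraction of members that agree)."""
    tally = Counter(predict_one(member, x) for member in ensemble)
    label, n = tally.most_common(1)[0]
    return label, n / len(ensemble)


def self_train(labeled, unlabeled, n_members=11, threshold=0.9,
               max_rounds=10, seed=0):
    """Repeatedly pseudo-label the unlabeled points the ensemble agrees on,
    add them to the training set, and retrain (hypothetical parameters)."""
    rng = random.Random(seed)
    labeled, pool = list(labeled), list(unlabeled)
    ensemble = train_ensemble(labeled, n_members, rng)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            label, agreement = vote(ensemble, x)
            (confident if agreement >= threshold else rest).append((x, label))
        if not confident:
            break  # nothing passes the threshold, so stop early
        labeled.extend(confident)
        pool = [x for x, _ in rest]
        ensemble = train_ensemble(labeled, n_members, rng)
    return ensemble, labeled


# Synthetic stand-in for intrusion data: two well-separated 2-D clusters.
rng = random.Random(42)


def blob(cx, cy, n):
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)) for _ in range(n)]


labeled = ([(x, "normal") for x in blob(0, 0, 5)]
           + [(x, "attack") for x in blob(5, 5, 5)])
unlabeled = blob(0, 0, 100) + blob(5, 5, 100)
ensemble, grown = self_train(labeled, unlabeled)
print(len(labeled), "->", len(grown), "labeled samples")
```

Using member agreement as the confidence measure is one simple way to let the ensemble decide which unlabeled samples to trust; a stricter `threshold` adds fewer pseudo-labels per round but lowers the risk of reinforcing early mistakes.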