计算机应用 ›› 2013, Vol. 33 ›› Issue (09): 2450-2454.DOI: 10.11772/j.issn.1001-9081.2013.09.2450

• 网络与通信 • 上一篇    下一篇

基于规则集划分的多决策树报文分类算法

马腾,陈庶樵,张校辉,田乐   

  1. 国家数字交换系统工程技术研究中心,郑州 450002
  • 收稿日期:2013-03-25 修回日期:2013-05-07 出版日期:2013-09-01 发布日期:2013-10-18
  • 通讯作者: 马腾
  • 作者简介:马腾(1987-),男,陕西西安人,硕士研究生,主要研究方向:报文分类;
    陈庶樵(1973-),男,河南郑州人,教授,博士,主要研究方向:网络安全;
    张校辉(1979-),男,河南洛阳人,讲师,博士,主要研究方向:异常流检测;
    田乐(1987-),男,陕西咸阳人,硕士研究生,主要研究方向:报文分类。
  • 基金资助:

    国家863计划项目;国家973计划项目;国家973计划项目

Multiple decision-tree packet classification algorithm based on rule set partitioning

MA Teng,CHEN Shuqiao,ZHANG Xiaaohui,TIAN Le   

  1. National Digital Switching System Engineering and Technology Research Center, Zhengzhou Henan 450002, China
  • Received:2013-03-25 Revised:2013-05-07 Online:2013-10-18 Published:2013-09-01
  • Contact: MA Teng

摘要: 为克服决策树算法处理高速网络、大容量规则集下的报文分类问题时内存使用量大的弊端,提出一种基于规则集划分的多决策树报文分类算法。在保证规则子集数量可控的前提下,采用启发式算法将规则集划分为有限个规则子集,最大限度分离交叠规则;提出两级级联决策树结构,降低决策树深度以减少规则查找时间。理论分析表明,该算法空间复杂度较传统单决策树算法大幅降低。仿真结果表明,该算法的内存使用量比目前空间性能最好的EffiCuts算法减少了30%,且维度可扩展性更好。

关键词: 报文分类, 规则集划分, 多决策树, 内存使用量, 大容量规则集

Abstract: For solving the problem of decision-tree algorithms' too much memory usage when coping with packet classification under the circumstance of high rate network and large volume rule set, a multiple decision-tree algorithm based on rule set partitioning was proposed in this paper. On the condition of controlling the number of subsets, heuristics were used to partition the rule set into limited number of subsets, in which the overlapping rules had been separated. Cascading decision-tree structure was proposed to lower the depth and reduce search time. The theoretical analysis shows that space complexity has been reduced greatly compared to traditional single decision-tree algorithm. The simulation results demonstrate that the algorithm reduces memory usage about 30% and has better dimension scalability when being compared with EffiCuts, which has the best performance for memory usage so far.

Key words: packet classification, rule set partitioning, multiple decision-tree, memory usage, large volume rule set

中图分类号: