Journal of Computer Applications ›› 2017, Vol. 37 ›› Issue (6): 1644-1649. DOI: 10.11772/j.issn.1001-9081.2017.06.1644


Distributed denial of service attack recognition based on bag of words model

MA Linjin1,2, WAN Liang1,2, MA Shaoju1, YANG Ting1, YI Huifan1   

  1. College of Computer Science and Technology, Guizhou University, Guiyang Guizhou 550025, China;
    2. Institute of Computer Software and Theory, Guizhou University, Guiyang Guizhou 550025, China
  • Received: 2016-11-08 Revised: 2017-01-06 Online: 2017-06-10 Published: 2017-06-14
  • Supported by:
    This work is partially supported by the Guizhou Provincial Science Department Project (KEHE LH[2014]7634, KEHE J[2011]2328).


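  • Corresponding author: WAN Liang
  • About the authors: MA Linjin (born 1991), male, from Taizhou, Zhejiang, master's candidate; research interests: information security, pattern recognition, abnormal traffic detection, distributed denial of service attack detection. WAN Liang (born 1974), male, from Tongren, Guizhou, professor, Ph.D., CCF member; research interests: information security, formal methods, machine learning, smart home. MA Shaoju (born 1991), female, from Bijie, Guizhou, master's candidate, CCF member; research interests: information security, app privacy protection. YANG Ting (born 1992), female, from Bijie, Guizhou, master's candidate; research interests: information security, smart home. YI Huifan (born 1993), male, from Anshun, Guizhou, master's candidate, CCF member; research interests: information security, formal methods.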

Abstract: To address the rapidly changing payloads of Distributed Denial of Service (DDoS) attacks, the reliance on experience when warning thresholds are set manually, and the slow updating of abnormal-traffic signatures, an improved DDoS attack detection algorithm based on the Binary Stream Point Bag of Words (BSP-BoW) model was proposed. The algorithm automatically extracts Stream Points (SPs) from the traffic data of the current network, performs adaptive anomaly detection for networks with different topologies, and reduces the labor cost of frequently updating the feature set. Firstly, mean clustering was carried out on the existing attack traffic and normal traffic to find the SPs in the network traffic. Then, the original traffic was mapped to the corresponding SPs and expressed formally as a histogram. Finally, DDoS attacks were detected and classified by Euclidean distance. The experimental results on the public dataset DARPA LLDOS1.0 show that the proposed algorithm achieves a higher recognition rate for abnormal network traffic than Locally Weighted Learning (LWL), Support Vector Machine (SVM), Random Tree (RT), Logistic regression analysis (Logistic) and Naive Bayes (NB). The proposed BoW-based algorithm shows good recognition performance and generalization ability in recognizing abnormal traffic from denial of service attacks, and is suitable for deployment on the network traffic equipment of Small and Medium Enterprises (SMEs).
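
The three steps named in the abstract (clustering to find SPs, histogram mapping, Euclidean-distance classification) can be illustrated with a minimal Python sketch. It is not the authors' BSP-BoW implementation: the two-dimensional packet features, the synthetic traffic generator `make_window`, the choice of k-means with k = 4 as the "mean clustering" step, and the use of per-class mean histograms as references are all assumptions made for illustration only.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(point, centers):
    # Index of the center closest to the point.
    return min(range(len(centers)), key=lambda i: euclidean(point, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    # Plain k-means; the resulting centers play the role of Stream Points.
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old center if a cluster goes empty
                centers[i] = [sum(col) / len(g) for col in zip(*g)]
    return centers

def histogram(window, centers):
    # Map each packet to its nearest SP and return normalized counts.
    # Assumes a non-empty window.
    counts = [0] * len(centers)
    for p in window:
        counts[nearest(p, centers)] += 1
    return [c / len(window) for c in counts]

def mean_histogram(windows, centers):
    hists = [histogram(w, centers) for w in windows]
    return [sum(col) / len(hists) for col in zip(*hists)]

def classify(window, centers, references):
    # Label of the reference histogram closest in Euclidean distance.
    h = histogram(window, centers)
    return min(references, key=lambda label: euclidean(h, references[label]))

def make_window(rng, attack_ratio, n=50):
    # Hypothetical synthetic traffic: each packet is a pair of scaled
    # features (for example size and inter-arrival time).
    window = []
    for _ in range(n):
        if rng.random() < attack_ratio:
            window.append((rng.gauss(0.1, 0.02), rng.gauss(0.05, 0.02)))
        else:
            window.append((rng.gauss(0.6, 0.1), rng.gauss(0.5, 0.1)))
    return window

rng = random.Random(42)
normal_train = [make_window(rng, 0.05) for _ in range(10)]
attack_train = [make_window(rng, 0.9) for _ in range(10)]

# Step 1: cluster all labeled training packets to obtain the SPs.
all_packets = [p for w in normal_train + attack_train for p in w]
centers = kmeans(all_packets, k=4)

# Step 2: express each class as a histogram over the SPs.
references = {
    "normal": mean_histogram(normal_train, centers),
    "attack": mean_histogram(attack_train, centers),
}

# Step 3: classify unseen windows by Euclidean distance to the references.
print(classify(make_window(rng, 0.9), centers, references))
print(classify(make_window(rng, 0.05), centers, references))
```

Because the SPs are learned from whatever traffic is supplied, rerunning the clustering step on a different network's data would yield a different codebook without any manually set threshold, which is the adaptivity the abstract describes.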

Key words: Bag of Words (BoW), machine learning, clustering, Distributed Denial of Service (DDoS) attack, anomaly traffic detection, Stream Point (SP)

