Multi-type application-layer DDoS attack detection method based on integrated learning

Journal of Computer Applications, 2022, Vol. 42, Issue (12): 3775-3784. DOI: 10.11772/j.issn.1001-9081.2021091653

Special topic: Cyberspace security

Yingzhi LI, Man LI, Ping DONG, Huachun ZHOU

College of Electronic Information Engineering, Beijing Jiaotong University, Beijing 100044, China

Received: 2021-09-22 Revised: 2022-01-14 Accepted: 2022-01-28 Online: 2022-12-21 Published: 2022-12-10
Contact: Yingzhi LI
About the authors: LI Man, born in 1997, female, a native of Puyang, Henan, Ph.D. candidate. Her research interests include cyber security and intelligent communication.
DONG Ping, born in 1979, male, a native of Beijing, Ph.D., professor. His research interests include next-generation Internet, smart Internet of vehicles, and mobile Internet.
ZHOU Huachun, born in 1965, male, a native of Beijing, Ph.D., professor. His research interests include intelligent communication, mobile Internet, network security, and satellite networks.
Supported by: National Key Research and Development Program of China (2018YFA0701604)

Abstract:

Application-layer Distributed Denial of Service (DDoS) attacks come in many types that are difficult to detect simultaneously. To address this problem, an application-layer DDoS attack detection method based on ensemble learning was proposed to detect multiple types of application-layer DDoS attacks. Firstly, the dataset generation module simulated normal and attack traffic, filtered and extracted the corresponding feature information, and generated 47-dimensional feature information characterizing Challenge Collapsar (CC), HTTP Flood, HTTP Post and HTTP Get attacks. Secondly, the offline training module input the processed effective feature information into the integrated Stacking detection model for training, thereby obtaining a detection model that can detect multiple types of application-layer DDoS attacks. Finally, the online detection module deployed the detection model online to determine the specific traffic type of the traffic to be detected. Experimental results show that, compared with the classification models constructed by Bagging, AdaBoost and XGBoost, the Stacking ensemble model improves the accuracy by 0.18, 0.21 and 0.19 percentage points respectively, and its malicious traffic detection rate reaches 98% under the optimal time window. These results verify the effectiveness of the proposed method for detecting multi-type application-layer DDoS attacks.

Key words: multi-type, application-layer Distributed Denial of Service (DDoS) attack, Distributed Denial of Service (DDoS), machine learning, ensemble learning
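The Stacking idea in the abstract (base classifiers whose predictions feed a second-level model) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the one-feature threshold rules, the lookup-table meta-learner, the two toy features (request rate and body size) and the twelve toy flows are stand-ins, not the paper's base learners, features or data.

```python
from collections import Counter, defaultdict

def majority(labels, default=None):
    """Most frequent label, or `default` when `labels` is empty."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0] if counts else default

def train_stump(rows, labels, feature):
    """Toy base learner: split one feature at its mean and vote on each side."""
    threshold = sum(r[feature] for r in rows) / len(rows)
    fallback = majority(labels)
    vote = {
        side: majority(
            [y for r, y in zip(rows, labels) if (r[feature] > threshold) == side],
            fallback,
        )
        for side in (False, True)
    }
    return lambda r: vote[r[feature] > threshold]

def train_meta(meta_rows, labels):
    """Toy meta-learner: map each tuple of base predictions to its majority label."""
    grouped = defaultdict(list)
    for m, y in zip(meta_rows, labels):
        grouped[tuple(m)].append(y)
    lookup = {key: majority(ys) for key, ys in grouped.items()}
    fallback = majority(labels)
    return lambda m: lookup.get(tuple(m), fallback)

def fit_stacking(rows, labels, features, k=2):
    """Stacking: train the meta-learner on out-of-fold base predictions."""
    n = len(rows)
    meta_rows = [None] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        bases = [
            train_stump([rows[i] for i in train], [labels[i] for i in train], f)
            for f in features
        ]
        # Each held-out row is scored by base learners that never saw it.
        for i in range(fold, n, k):
            meta_rows[i] = [b(rows[i]) for b in bases]
    meta = train_meta(meta_rows, labels)
    # Refit the base learners on all rows before deployment.
    bases = [train_stump(rows, labels, f) for f in features]
    return lambda r: meta([b(r) for b in bases])

# Hypothetical flows: (requests per second, mean request body size in bytes).
rows = [(1, 100), (2, 150), (50, 120), (55, 130), (45, 5000), (48, 5500),
        (3, 120), (2, 180), (60, 110), (52, 160), (42, 5200), (47, 6000)]
labels = ["Benign", "Benign", "HTTP Get", "HTTP Get", "HTTP Post", "HTTP Post"] * 2

model = fit_stacking(rows, labels, features=(0, 1))
print(model((2, 110)), model((52, 125)), model((46, 5200)))
```

Neither threshold rule can separate three classes on its own, because each outputs only two labels; the meta-learner separates them by looking at the pair of base predictions. Training the meta-learner on out-of-fold predictions keeps it from simply memorising base-learner outputs on data those learners were fitted to.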

CLC Number: TP393

Yingzhi LI, Man LI, Ping DONG, Huachun ZHOU. Multi-type application-layer DDoS attack detection method based on integrated learning[J]. Journal of Computer Applications, 2022, 42(12): 3775-3784.


Traffic type \ Model	RF	XGBoost	ET	LightGBM	CNN	LSTM
CC	0.9912	0.9917	0.9858	0.9812	0.9494	0.9525
HTTP Flood	0.9107	0.9504	0.9017	0.9124	0.6863	0.7610
HTTP Post	0.8746	0.9111	0.8560	0.8767	0.3284	0.6412
HTTP Get	0.8247	0.8734	0.8111	0.8374	0.7751	0.6511
Benign	1.0000	1.0000	1.0000	1.0000	0.9986	0.9992
Other	1.0000	1.0000	0.9980	1.0000	0.9943	0.9940

Predicted class \ Actual class	0	1
0	True negative (TN)	False negative (FN)
1	False positive (FP)	True positive (TP)
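The four confusion-matrix cells above are the inputs to the precision, recall, F1 and accuracy figures reported in the later tables. A minimal sketch using the standard definitions follows; the counts in the usage line are hypothetical and are not taken from the paper.

```python
def binary_metrics(tn, fp, fn, tp):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # share of predicted positives that are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # share of real positives that are found
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)            # harmonic mean of precision and recall
    accuracy = (tp + tn) / (tn + fp + fn + tp)         # share of all predictions that are correct
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

# Hypothetical counts, for illustration only.
metrics = binary_metrics(tn=50, fp=10, fn=20, tp=40)
print(metrics)
```

For a multi-class problem such as the six traffic types here, the same formulas are applied per class by treating that class as 1 and all other classes as 0.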

Collection time	Source IP	Destination IP	Traffic type
2021-05-22T15:31:00 to 15:56:00	23.1.0.1	23.1.1.1	CC
	23.1.0.7	23.1.1.1	HTTP Flood
	23.1.0.8	23.1.1.1	HTTP Post
	23.1.0.9	23.1.1.1	HTTP Get
Continuous	23.1.0.20~23.1.0.29	23.1.1.7	Benign
2021-05-22T20:14:00 to 2021-05-23T16:15:00	23.1.0.1~23.1.0.13	23.1.1.2~23.1.1.6	Other
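One way a collection plan like the one above could be turned into ground-truth labels is to match each flow's source and destination address against the listed ranges. The sketch below is an assumption about how such labelling might be done, not the authors' documented procedure; it ignores the collection-time column, and `RULES` and `label_flow` are hypothetical names.

```python
from ipaddress import IPv4Address

def _span(first, last):
    """Inclusive IPv4 range as a pair of integers."""
    return int(IPv4Address(first)), int(IPv4Address(last))

# (label, source range, destination range), copied from the collection table.
RULES = [
    ("CC",         _span("23.1.0.1", "23.1.0.1"),   _span("23.1.1.1", "23.1.1.1")),
    ("HTTP Flood", _span("23.1.0.7", "23.1.0.7"),   _span("23.1.1.1", "23.1.1.1")),
    ("HTTP Post",  _span("23.1.0.8", "23.1.0.8"),   _span("23.1.1.1", "23.1.1.1")),
    ("HTTP Get",   _span("23.1.0.9", "23.1.0.9"),   _span("23.1.1.1", "23.1.1.1")),
    ("Benign",     _span("23.1.0.20", "23.1.0.29"), _span("23.1.1.7", "23.1.1.7")),
    ("Other",      _span("23.1.0.1", "23.1.0.13"),  _span("23.1.1.2", "23.1.1.6")),
]

def label_flow(src_ip, dst_ip):
    """Return the traffic type for a flow, or None if no rule matches."""
    src, dst = int(IPv4Address(src_ip)), int(IPv4Address(dst_ip))
    for label, (src_lo, src_hi), (dst_lo, dst_hi) in RULES:
        if src_lo <= src <= src_hi and dst_lo <= dst <= dst_hi:
            return label
    return None

print(label_flow("23.1.0.1", "23.1.1.1"))   # source 23.1.0.1 attacking 23.1.1.1
print(label_flow("23.1.0.1", "23.1.1.3"))   # same source, different destination
```

The destination address matters: 23.1.0.1 appears both as the CC source and inside the "Other" source range, and only the destination distinguishes the two.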

Traffic type	Traffic label	Proportion/%	Number of samples
CC	0	7.32	71591
HTTP Flood	1	1.66	16200
HTTP Post	2	1.57	15317
HTTP Get	3	1.69	16546
Benign	4	38.04	371830
Other	5	49.72	485953
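The proportion column can be recomputed from the sample counts, which also makes the class imbalance explicit: the four attack types together account for roughly one eighth of the data. A short check, using only the counts from the table above:

```python
# Sample counts per traffic type, copied from the table.
counts = {
    "CC": 71591,
    "HTTP Flood": 16200,
    "HTTP Post": 15317,
    "HTTP Get": 16546,
    "Benign": 371830,
    "Other": 485953,
}

total = sum(counts.values())
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

attack_types = ("CC", "HTTP Flood", "HTTP Post", "HTTP Get")
attack_share = round(100 * sum(counts[t] for t in attack_types) / total, 2)

print(total, shares, attack_share)
```

With HTTP Flood, HTTP Post and HTTP Get each below 2% of the samples, overall accuracy is dominated by the Benign and Other classes, which is one reason per-class and macro-averaged metrics are worth reading alongside it.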

Number of features	Traffic type	Precision	Recall	F1 score	Offline training time/s	Online detection time/s
47	CC	0.9913	0.9888	0.9909	3354.48	56.83
	HTTP Flood	0.9618	0.9730	0.9674
	HTTP Post	0.8726	0.9018	0.8870
	HTTP Get	0.9235	0.8760	0.8991
	Benign	1.0000	1.0000	1.0000
	Other	1.0000	1.0000	1.0000
78	CC	0.9676	0.9476	0.9575	4372.74	93.41
	HTTP Flood	0.8247	0.9284	0.8735
	HTTP Post	0.8727	0.9007	0.8865
	HTTP Get	0.9186	0.8714	0.8944
	Benign	1.0000	1.0000	1.0000
	Other	1.0000	0.9947	0.9973
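Macro averaging takes the unweighted mean of a metric over all classes, so each traffic type counts equally regardless of its sample size. As a worked check, averaging the six per-class rows of the 47-feature configuration gives values that coincide with the macro-averaged figures reported for the Stacking model in the model comparison table; that correspondence is an observation from the numbers, not a statement made on the page.

```python
# Per-class (precision, recall, F1) for the 47-feature configuration, from the table.
per_class = {
    "CC":         (0.9913, 0.9888, 0.9909),
    "HTTP Flood": (0.9618, 0.9730, 0.9674),
    "HTTP Post":  (0.8726, 0.9018, 0.8870),
    "HTTP Get":   (0.9235, 0.8760, 0.8991),
    "Benign":     (1.0000, 1.0000, 1.0000),
    "Other":      (1.0000, 1.0000, 1.0000),
}

def macro_average(rows):
    """Unweighted mean of each metric column across classes."""
    columns = zip(*rows.values())
    return tuple(round(sum(col) / len(rows), 4) for col in columns)

macro_precision, macro_recall, macro_f1 = macro_average(per_class)
print(macro_precision, macro_recall, macro_f1)
```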

Model	Accuracy	Macro-average precision	Macro-average recall	Macro-average F1 score
Bagging	0.9925	0.9423	0.9380	0.9397
AdaBoost	0.9922	0.9348	0.9318	0.9331
XGBoost	0.9924	0.9338	0.9476	0.9405
Stacking	0.9943	0.9582	0.9566	0.9574
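The accuracy improvements quoted in the abstract are differences in percentage points, that is, the difference between two accuracies multiplied by 100, not relative changes. They can be reproduced from the accuracy column above:

```python
# Accuracy per model, copied from the comparison table.
accuracy = {
    "Bagging": 0.9925,
    "AdaBoost": 0.9922,
    "XGBoost": 0.9924,
    "Stacking": 0.9943,
}

# Gain of Stacking over each baseline, in percentage points.
gains = {
    model: round((accuracy["Stacking"] - value) * 100, 2)
    for model, value in accuracy.items()
    if model != "Stacking"
}
print(gains)
```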

Attack type	Precision	Recall	F1 score
Benign	0.9997	0.9998	0.9999
CC	0.9885	0.9849	0.9867
HTTP Flood	0.8898	0.9778	0.9317
HTTP Get	0.9292	0.8778	0.9028
HTTP Post	0.8653	0.9144	0.8892
Other	0.9968	0.9915	0.9941

Time window/min	Malicious traffic detection rate/%
1	97.88
2	98.01
3	97.14
