Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (12): 3896-3908. DOI: 10.11772/j.issn.1001-9081.2024121733

• Cyberspace security •

Cyber anti-mapping method based on adaptive perturbation

Chengyi WANG1, Lei XU2, Jinyin CHEN1,3, Hongjun QIU4

  1. College of Information Engineering, Zhejiang University of Technology, Hangzhou Zhejiang 310023, China
  2. Unit 93196 of PLA, Bayingolin Xinjiang 841000, China
  3. Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou Zhejiang 310023, China
  4. School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou Zhejiang 310018, China
  • Received: 2024-12-10 Revised: 2025-03-27 Accepted: 2025-04-01 Online: 2025-04-15 Published: 2025-12-10
  • Contact: Lei XU
  • About the authors: WANG Chengyi, born in 2000 in Yongkang, Zhejiang, M. S. candidate. His research interests include artificial intelligence security, deep learning, and reinforcement learning.
    XU Lei, born in 1997 in Chifeng, Inner Mongolia, M. S., assistant engineer. His research interests include flight vehicle design and reinforcement learning.
    CHEN Jinyin, born in 1982 in Ningbo, Zhejiang, Ph. D., professor, CCF member. Her research interests include artificial intelligence security, graph data mining, and evolutionary computation.
    QIU Hongjun, born in 1982 in Jinhua, Zhejiang, Ph. D., lecturer, CCF member. Her research interests include artificial intelligence security and industrial network security.
  • Supported by:
    National Natural Science Foundation of China (62072406); National Key Research and Development Program of China (2018AAA0100801); Zhejiang Provincial Natural Science Foundation (LDQ23F020001)

Abstract:

Intelligent cyber mapping methods based on Deep Reinforcement Learning (DRL) model the cyber mapping process as a Markov Decision Process (MDP) and train attacking agents through trial-and-error learning to identify critical network paths and obtain network topology information. However, traditional cyber anti-mapping methods are usually based on fixed rules, which makes it difficult for them to cope with the constantly changing behavioral strategies of DRL agents during mapping. Therefore, a cyber anti-mapping method based on adaptive perturbation, named AIP (Adaptive Interference Perturbation), was proposed to defend against intelligent cyber mapping attacks. Firstly, traffic conditions were predicted from historical traffic sequence information, gradient information was obtained from the difference between the predicted conditions and the real traffic data, and the adversarial perturbations generated from this gradient information were added back to the original traffic samples to produce adversarial examples. Secondly, a feature reconstruction method fusing traffic situation and routing state was adopted to optimize the sparse dictionary dynamically through iteration, thereby completing the sparse transformation of the traffic data. Finally, the sparsified adversarial traffic was used as the observable traffic information of the network topology, and the defense performance of AIP was evaluated by analyzing the changes in the link weights assigned by the mapping agent across the network topology and the differences in network delay. Experimental results show that, compared with traditional perturbation defense methods such as Fast Gradient Sign Method (FGSM) and Random Attack (RA), AIP interferes with the attacker more significantly when the traffic intensity in the network exceeds 25%, leading to larger changes in the link weights of the network topology and a clearly noticeable effect on network delay. Compared with Static Honeypot Deployment (SHD) and Dynamic Honeypot Deployment based on Q-Learning (DHD-Q), the delay trend comparison shows that AIP interferes with the attacker continuously, making it difficult for the attacker to discover the critical paths in the network, thereby effectively controlling fluctuations in network delay and achieving better defense efficiency and stability.

Key words: reinforcement learning, intelligent mapping, critical path, anti-mapping, adversarial perturbation

CLC number:
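
The first step described in the abstract (predict traffic from history, take the gradient of the prediction error, and add a perturbation back to the original traffic sample) can be illustrated with a minimal sketch. This is not the AIP method itself: the abstract does not specify the predictor, the loss, or how the perturbation size adapts. The least-squares linear predictor, the squared-error loss, the fixed step size `eps`, the synthetic traffic series, and all function names below are assumptions made only for illustration. The step shown is a generic signed-gradient (FGSM-style) perturbation.

```python
import numpy as np


def fit_linear_predictor(series: np.ndarray, window: int) -> np.ndarray:
    """Least-squares weights w such that series[t] ~= series[t-window:t] @ w.

    Hypothetical stand-in for the traffic predictor; the abstract does not
    say which model is used.
    """
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def perturb_window(x: np.ndarray, target: float, w: np.ndarray, eps: float) -> np.ndarray:
    """One signed-gradient step that increases the squared prediction error."""
    residual = x @ w - target      # predicted traffic minus real traffic
    grad = 2.0 * residual * w      # gradient of (x @ w - target)**2 w.r.t. x
    return x + eps * np.sign(grad)


# Synthetic traffic with a daily-like cycle (assumed data, not from the paper).
rng = np.random.default_rng(0)
t = np.arange(200)
traffic = 10.0 + 3.0 * np.sin(2 * np.pi * t / 24) + rng.normal(0.0, 0.3, t.size)

window = 8
w = fit_linear_predictor(traffic[:-1], window)          # fit on the history only
x, target = traffic[-window - 1:-1], traffic[-1]        # held-out last window
x_adv = perturb_window(x, target, w, eps=0.5)           # eps is an assumed budget

clean_err = (x @ w - target) ** 2
adv_err = (x_adv @ w - target) ** 2
print(f"clean error {clean_err:.4f}, perturbed error {adv_err:.4f}")
```

Because the perturbation follows the sign of the gradient, each traffic value moves by at most `eps`, while the prediction error on the perturbed window grows. The sparse-dictionary transformation and the evaluation against a mapping agent described in the abstract are outside the scope of this sketch.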