Journal of Computer Applications (《计算机应用》), sole official website

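Network anti-mapping method based on adaptive perturbation

WANG Chengyi (王诚熠)1, CHEN Jinyin (陈晋音)2, LIU Xinran (刘欣然)1

  1. College of Information Engineering, Zhejiang University of Technology
  2. Zhejiang University of Technology
  • Received: 2024-12-10 Revised: 2025-03-27 Online: 2025-04-15 Published: 2025-04-15
  • Corresponding author: CHEN Jinyin (陈晋音)
  • Funding:
    National Natural Science Foundation of China; National Key Research and Development Program of China; Zhejiang Provincial Natural Science Foundation; Zhejiang Provincial Key Research and Development Program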

Abstract: Intelligent network mapping methods based on Deep Reinforcement Learning (DRL) model the mapping process as a Markov Decision Process (MDP) and train an attacking agent by trial and error to find critical network paths and thereby obtain network topology information. Traditional network anti-mapping methods, however, usually rely on fixed rules and struggle to cope with the DRL agent's constantly changing behavior policy during mapping. To address this, a network anti-mapping method based on adaptive perturbation, called Adaptive Interference Perturbation (AIP), was proposed to defend against intelligent network mapping attacks. First, traffic conditions were predicted from historical traffic sequences, gradient information was obtained from the difference between the prediction and the real traffic data, and the resulting adversarial perturbation was added back to the original traffic samples to produce adversarial examples. Then, a feature reconstruction approach that fused traffic situation and routing state was adopted: a sparse dictionary was optimized dynamically through iteration to apply a sparse transformation to the traffic data, which preserved the perturbation's effect on key traffic features while making it stealthier and further hindered the agent's decision-making. Experimental results show that, compared with common perturbation defenses such as Fast Gradient Sign Method (FGSM) defense and random manipulation defense, the attacker is increasingly affected by the AIP perturbation once the network traffic intensity exceeds 25%, which widens the differences in link-weight changes across the network topology and affects network delay. In comparison experiments against static and dynamic honeypot deployment strategies, AIP continuously confuses the attacker, making it difficult for the attacker to find the critical paths in the network and keeping network delay within a bounded range, with some gains in both defense efficiency and stability.

Key words: reinforcement learning, intelligent mapping, critical path, anti-mapping, adversarial perturbation
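
As a rough illustration of the first step described in the abstract (a gradient taken from the gap between predicted and real traffic, turned into a perturbation and added back to the traffic sample), here is a minimal NumPy sketch. It is not the authors' AIP implementation. The linear predictor, its weights, the traffic values, `epsilon`, and the top-k sparsification that stands in for the sparse-dictionary stage are all assumptions made for illustration only.

```python
import numpy as np

def predict(history, weights):
    """Hypothetical linear predictor: next-step traffic as a weighted sum of the window."""
    return float(history @ weights)

def input_gradient(history, weights, observed):
    """Gradient of 0.5 * (prediction - observed)^2 with respect to the history window."""
    error = predict(history, weights) - observed
    return error * weights

def sign_perturbation(history, weights, observed, epsilon):
    """FGSM-style step: shift each input by epsilon in the direction that increases the error."""
    grad = input_gradient(history, weights, observed)
    return epsilon * np.sign(grad)

def keep_top_k(perturbation, grad, k):
    """Crude sparsification (assumed stand-in for a learned sparse dictionary):
    keep the k entries with the largest gradient magnitude and zero the rest."""
    sparse = np.zeros_like(perturbation)
    idx = np.argsort(np.abs(grad))[-k:]
    sparse[idx] = perturbation[idx]
    return sparse

# Made-up normalized link loads, predictor weights, and true next-step load.
history = np.array([0.20, 0.25, 0.30, 0.40])
weights = np.array([0.1, 0.2, 0.3, 0.4])
observed = 0.50

grad = input_gradient(history, weights, observed)
delta = sign_perturbation(history, weights, observed, epsilon=0.05)
sparse_delta = keep_top_k(delta, grad, k=2)
adversarial = history + sparse_delta

clean_error = abs(predict(history, weights) - observed)
adv_error = abs(predict(adversarial, weights) - observed)
print("clean error:", clean_error, "perturbed error:", adv_error)
```

In this toy setting the perturbation touches only two of the four inputs yet still enlarges the prediction error, which is the trade-off between effect and stealth that the abstract attributes to the sparse transformation. A real defense would need a trained traffic predictor and a learned dictionary in place of these placeholders.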

CLC number: