

Adversarial purification method based on directly guided diffusion model

HU Yan1, LI Peng1,2, CHENG Shuyan1

  1. School of Computer Science, School of Software and School of Cyberspace Security, Nanjing University of Posts and Telecommunications; 2. Institute of Network Security and Trusted Computing, Nanjing University of Posts and Telecommunications
  • Received: 2025-04-14 Revised: 2025-06-05 Online: 2025-07-01 Published: 2025-07-01
  • Corresponding author: LI Peng
  • About the authors: HU Yan, born in 2001 in Taizhou, Jiangsu, M.S. candidate. His research interests include deep learning, adversarial attack and defense. LI Peng, born in 1979 in Changting, Fujian, Ph.D., professor, doctoral supervisor, CCF member (No. 48573M). His research interests include computer communication networks, wireless sensor networks, and information security. CHENG Shuyan, born in 1999 in Linfen, Shanxi, Ph.D. candidate. Her research interests include deep learning, adversarial attack and defense.
  • Supported by:
    National Natural Science Foundation of China (62102194); Six Talent Peaks Project of Jiangsu Province (RJFW-111)

Abstract: Deep Neural Networks (DNNs) are susceptible to adversarial perturbations, so attackers can deceive a DNN by adding imperceptible adversarial perturbations to an image. Diffusion-based adversarial purification methods defend against such attacks by using a diffusion model to generate clean samples, but the diffusion model itself is also affected by adversarial perturbations. Therefore, a novel adversarial purification method named StraightDiffusion was proposed, in which the purification process of the diffusion model was directly guided by the adversarial samples. Firstly, the key problems and limitations of existing methods that use diffusion models for adversarial purification were discussed. Secondly, a new sampling method was proposed that applies two-stage guidance, head guidance and tail guidance, during denoising: guidance is applied only in the early and late stages of the denoising process and omitted in the intermediate stages. Experimental results on the CIFAR-10 and ImageNet datasets with three classifiers, WideResNet-70-16, WideResNet-28-10, and ResNet50, showed that StraightDiffusion outperformed the baseline methods in defense performance, achieving the best standard accuracy and robust accuracy on both datasets compared with methods such as diffusion models for adversarial purification (DiffPure) and guided diffusion model for purification (GDMP). These results verified that the proposed method improves purification quality, thereby enhancing the robust accuracy of classification models against adversarial samples and achieving effective defense under multiple attack scenarios.
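
The abstract describes the sampling procedure only at a high level. The sketch below shows one way such head/tail guidance could be embedded in a DDPM-style purification loop; it is a minimal illustration, not the paper's implementation. The noise-prediction network eps_model, the linear beta schedule, the guidance form, and all hyperparameters (t_star, head_steps, tail_steps, guidance_scale) are assumptions introduced here for illustration.

import torch

def purify(eps_model, x_adv, T=1000, t_star=100,
           head_steps=10, tail_steps=10, guidance_scale=4.0):
    """Illustrative two-stage guided purification (names/values are assumptions)."""
    device = x_adv.device
    betas = torch.linspace(1e-4, 0.02, T, device=device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Forward diffusion: noise the adversarial input up to step t_star,
    # submerging the adversarial perturbation in Gaussian noise.
    a_bar = alpha_bars[t_star - 1]
    x = a_bar.sqrt() * x_adv + (1.0 - a_bar).sqrt() * torch.randn_like(x_adv)

    # Reverse denoising from t_star - 1 down to 0.
    for t in reversed(range(t_star)):
        t_batch = torch.full((x.shape[0],), t, device=device, dtype=torch.long)
        eps = eps_model(x, t_batch)  # predicted noise at step t

        # Standard DDPM posterior mean.
        mean = (x - betas[t] / (1.0 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()

        # Two-stage guidance: pull the estimate toward the (noised)
        # adversarial input only in the head and tail of the trajectory;
        # the intermediate steps run unguided.
        in_head = t >= t_star - head_steps
        in_tail = t < tail_steps
        if in_head or in_tail:
            x_ref = alpha_bars[t].sqrt() * x_adv  # reference scaled to noise level t
            mean = mean - guidance_scale * betas[t] * (mean - x_ref)

        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * noise
    return x.clamp(-1.0, 1.0)

One plausible reading of this design, under the same assumptions: head guidance anchors the trajectory to the input's semantics while the noise level is still high, tail guidance restores fine detail near the end, and the unguided middle steps leave the diffusion model free to strip out the adversarial perturbation.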

Key words: adversarial perturbation, adversarial purification, diffusion model, robust accuracy, neural network, guidance

CLC number: