Adversarial training method with adaptive attack strength

Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (1): 94-100. DOI: 10.11772/j.issn.1001-9081.2023060854

Tong CHEN, Jiwei WEI, Shiyuan HE, Jingkuan SONG, Yang YANG

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China

Received: 2023-07-01 Revised: 2023-08-24 Accepted: 2023-08-28 Online: 2023-09-14 Published: 2024-01-10
Contact: Jiwei WEI
About author: CHEN Tong, born in 2000 in Yancheng, Jiangsu, M. S. candidate. His research interests include deep learning, adversarial attack and defense.
WEI Jiwei (first contact), born in 1991 in Xiangcheng, Henan, Ph. D., CCF member. His research interests include adversarial attack and defense, metric learning, cross-modal retrieval.
HE Shiyuan, born in 1995 in Xining, Qinghai, Ph. D. His research interests include adversarial attack and defense, multimedia retrieval.
SONG Jingkuan, born in 1986 in Huai'an, Jiangsu, Ph. D., professor, CCF professional member. His research interests include large-scale multimedia retrieval, image/video segmentation, image/video understanding.
YANG Yang, born in 1983 in Dalian, Liaoning, Ph. D., professor, CCF senior member. His research interests include multimedia retrieval, social media analysis, machine learning.
Supported by:
National Natural Science Foundation of China (U20B2063); China Postdoctoral Science Foundation (2022M720660)

Abstract

Deep Neural Networks (DNNs) are vulnerable to adversarial examples, which has raised significant concerns about the security and reliability of artificial intelligence systems. Adversarial training is an effective way to enhance adversarial robustness. Existing methods adopt fixed adversarial example generation strategies and neglect the importance of the generation phase for adversarial training. To address this issue, an adversarial training method based on adaptive attack strength was proposed. Firstly, the clean sample and the adversarial sample were input into the model to obtain the outputs. Then, the difference between the model outputs for the clean sample and the adversarial sample was calculated. Finally, the change in this difference relative to the previous moment was measured, and the strength of the adversarial sample was adjusted automatically. Comprehensive experiments on three benchmark datasets demonstrate that, compared with the baseline method Adversarial Training with Projected Gradient Descent (PGD-AT), the proposed method improves robust accuracy under AutoAttack (AA) by 1.92, 1.50 and 3.35 percentage points on the three datasets, respectively, and that it outperforms the state-of-the-art defense method Adversarial Training with Learnable Attack Strategy (LAS-AT) in terms of robustness and natural accuracy. Furthermore, from the perspective of data augmentation, the proposed method can effectively address the problem that the augmentation effect of adversarial training, a special form of data augmentation, keeps diminishing as training progresses.

Key words: adversarial training, adversarial example, adversarial defense, adaptive attack strength, deep learning, image classification, artificial intelligence security
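The three steps in the abstract (compare clean and adversarial outputs, track how that difference changes between updates, adjust the attack strength) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the update rule, so the names `output_gap` and `AdaptiveStrength`, the use of the number of attack steps as the "strength", the default threshold, and the choice to strengthen the attack when the gap stops changing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


def output_gap(clean_out: Sequence[float], adv_out: Sequence[float]) -> float:
    """Mean absolute difference between model outputs on clean and adversarial inputs."""
    if len(clean_out) != len(adv_out):
        raise ValueError("outputs must have the same length")
    return sum(abs(c - a) for c, a in zip(clean_out, adv_out)) / len(clean_out)


@dataclass
class AdaptiveStrength:
    """Hypothetical controller: raises attack strength when the gap plateaus."""
    steps: int = 1                    # assumed strength knob: attack iterations
    max_steps: int = 10               # assumed upper bound on the strength
    threshold: float = 4e-3           # assumed trigger on the change of the gap
    prev_gap: Optional[float] = None  # gap observed at the previous update

    def update(self, gap: float) -> int:
        """Compare the gap with the previous one and adjust the strength."""
        if self.prev_gap is not None:
            change = abs(gap - self.prev_gap)
            # Assumption: a nearly unchanged gap means the current attack no
            # longer challenges the model, so the attack is made stronger.
            if change < self.threshold and self.steps < self.max_steps:
                self.steps += 1
        self.prev_gap = gap
        return self.steps


# Usage with made-up gap values (not experimental data).
controller = AdaptiveStrength()
gaps = [0.50, 0.40, 0.399, 0.398, 0.30]
history = [controller.update(g) for g in gaps]
print(history)
```

In a real training loop the gap would be computed from the model's outputs on each batch, and the returned strength would parameterize the attack used to generate the next batch of adversarial examples.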

CLC Number: TP181

Tong CHEN, Jiwei WEI, Shiyuan HE, Jingkuan SONG, Yang YANG. Adversarial training method with adaptive attack strength[J]. Journal of Computer Applications, 2024, 44(1): 94-100.

Figures/Tables: 9

Accuracy on clean samples and robust accuracy under different attacks (%)
Dataset	Method	Clean	PGD-10	PGD-20	PGD-50	C&W	AA
CIFAR-10	PGD-AT	85.17	56.07	55.08	54.88	53.91	51.69
CIFAR-10	TRADES	85.72	56.75	56.10	55.90	53.87	53.40
CIFAR-10	MART	84.17	58.98	58.56	58.06	54.58	51.10
CIFAR-10	FAT	87.97	50.31	49.86	48.79	48.65	47.48
CIFAR-10	GAIRAT	86.30	60.64	59.54	58.74	45.57	40.30
CIFAR-10	LAS-AT	86.23	57.64	56.49	56.12	55.73	53.58
CIFAR-10	Proposed method	85.82	57.51	56.58	56.07	55.85	53.61
CIFAR-100	PGD-AT	60.89	32.19	31.69	31.45	30.10	27.86
CIFAR-100	TRADES	58.61	29.20	28.66	28.56	27.05	25.94
CIFAR-100	SAT	62.82	28.10	27.17	26.76	27.32	24.57
CIFAR-100	LAS-AT	61.80	33.45	32.77	32.54	31.12	29.03
CIFAR-100	Proposed method	61.70	33.98	33.38	33.10	31.56	29.36

Accuracy on clean samples and robust accuracy under different attacks (%)
Method	Clean	PGD-50	C&W	AA
PGD-AT	43.98	19.98	17.60	13.78
TRADES	39.16	15.74	12.92	12.32
LAS-AT	44.86	22.16	18.54	16.74
Proposed method	44.86	23.25	18.67	17.13
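As a quick consistency check, the AutoAttack (AA) gains quoted in the abstract can be recomputed from the PGD-AT and proposed-method rows of the result tables. The sketch below copies the AA values from those rows; the mapping of the third gain to this table is inferred from the numbers, since the page does not name the dataset here.

```python
# AA robust accuracy (%) copied from the tables: (PGD-AT, proposed method).
aa_accuracy = {
    "CIFAR-10": (51.69, 53.61),
    "CIFAR-100": (27.86, 29.36),
    "third benchmark (dataset not named on this page)": (13.78, 17.13),
}

# Gain in percentage points, rounded to the two decimals used in the tables.
gains = {
    name: round(proposed - baseline, 2)
    for name, (baseline, proposed) in aa_accuracy.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
```

The three differences agree with the 1.92, 1.50 and 3.35 percentage points stated in the abstract.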

Accuracy on clean samples and robust accuracy under different attacks (%)
Method	Clean	FGSM	PGD-20	C&W
Madry-AT	87.30	56.10	45.80	46.80
CAT	77.43	57.17	46.06	42.28
DART	85.03	63.53	48.70	47.27
FAT	87.97	65.94	49.86	48.65
LAS-Madry-AT	84.95	67.16	55.61	54.31
Proposed method	85.33	67.65	56.09	54.62

Accuracy on clean samples and robust accuracy under different attacks (%)
Threshold	Clean	PGD-50	C&W	AA
3×10^-3	43.35	22.53	18.23	16.63
4×10^-3	44.86	23.25	18.67	17.13
5×10^-3	44.06	23.18	18.56	16.95
6×10^-3	43.60	23.13	18.46	16.72
7×10^-3	43.61	22.68	18.10	16.58

Accuracy on clean samples and robust accuracy under different attacks (%)
Backbone	Method	Clean	PGD-50	C&W	AA
VGG19	PGD-AT	70.86	46.65	44.33	42.92
VGG19	Proposed method	78.44	50.31	47.71	45.30
Res18	PGD-AT	82.44	52.76	51.17	49.03
Res18	Proposed method	84.12	54.64	52.92	50.55
PARes18	PGD-AT	81.64	50.70	49.03	46.54
PARes18	Proposed method	82.93	52.78	50.77	48.79
WRN28-10	PGD-AT	85.48	54.21	53.71	51.25
WRN28-10	Proposed method	86.28	55.68	55.45	53.24
WRN34-10	PGD-AT	85.17	54.88	53.91	51.69
WRN34-10	Proposed method	85.82	56.07	55.85	53.61

Accuracy on clean samples and robust accuracy under different attacks (%)
PGD budget I	Method	Clean	PGD-50	C&W	AA
6	PGD-AT	43.34	19.11	16.82	13.32
6	Proposed method	45.28	22.88	18.16	16.59
8	PGD-AT	43.40	19.49	17.00	13.57
8	Proposed method	43.59	22.67	18.44	16.90
10	PGD-AT	43.98	19.98	17.60	13.78
10	Proposed method	44.86	23.25	18.67	17.13
