Journal of Computer Applications
王练1,张豪杰2,梅天风1
Abstract: Traditional adversarial training applies a uniform perturbation setting that ignores robustness differences among samples, which makes it difficult to balance the natural accuracy and robust accuracy of a model. To address this issue, this paper proposes a dynamic adversarial training method based on sample robustness disparity that optimizes the two jointly, providing technical support for defending deep learning models in the artificial intelligence security domain. First, the proposed method quantifies sample robustness by the confidence gap between the top-1 and top-2 predicted categories, and customizes perturbations only for correctly classified clean samples. Second, it combines the confidence gap with skewness information of the multi-category confidence distribution, and constructs a differentiated perturbation generation mechanism with an early stopping strategy. Finally, it adopts Kullback-Leibler divergence to measure the sample distribution difference and designs a dynamically weighted loss function that prioritizes learning on samples with vulnerable robustness, achieving sample-level fine-grained training through a dual-module framework. Experimental comparisons with seven mainstream methods on three benchmark datasets (CIFAR-10, CIFAR-100 and Tiny ImageNet) show that the proposed method achieves significant improvements in both clean-sample accuracy and adversarial robust accuracy, with higher training efficiency and good adaptability to different perturbation budgets. Ablation experiments validate the efficacy of each module. The results demonstrate that the proposed method effectively overcomes the limitations of traditional methods, jointly optimizes natural accuracy and robust accuracy, and generalizes well as a strategy, offering a feasible approach to the fine-grained design of adversarial training and its application in security-critical fields.
Key words: Deep Neural Network (DNN), adversarial robustness, adversarial training, adversarial examples, adaptive perturbation
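The abstract does not give the method's formulas, so the following is only a minimal NumPy sketch of two ideas it names: the top-1/top-2 confidence gap as a per-sample robustness score, and a Kullback-Leibler divergence used to weight samples. The function names, the linear mapping from gap to perturbation budget, the `eps_max` default, and the weight normalization are all illustrative assumptions, not the authors' definitions. The skewness term and the early stopping strategy are omitted.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def confidence_gap(probs):
    """Gap between the largest and second-largest class confidence."""
    top2 = np.sort(probs, axis=1)[:, -2:]  # columns: second-largest, largest
    return top2[:, 1] - top2[:, 0]

def per_sample_budget(probs, labels, eps_max=8 / 255):
    """Hypothetical per-sample perturbation budget.

    Assumption: the budget grows linearly with the confidence gap, and only
    correctly classified clean samples receive a perturbation.
    """
    gap = confidence_gap(probs)
    correct = probs.argmax(axis=1) == labels
    return np.where(correct, eps_max * gap, 0.0)

def kl_weights(p_clean, p_adv, tiny=1e-12):
    """Hypothetical loss weights proportional to KL(p_clean || p_adv).

    Samples whose predictions shift more under attack get larger weights;
    the weights are normalized to sum to 1.
    """
    kl = np.sum(p_clean * (np.log(p_clean + tiny) - np.log(p_adv + tiny)), axis=1)
    kl = kl + tiny  # keeps the normalization defined when every KL is zero
    return kl / kl.sum()
```

In a training loop, `per_sample_budget` would set the attack radius for each clean sample, and `kl_weights` would scale each sample's loss term. How the paper actually combines these quantities is not stated in the abstract.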
王练, 张豪杰, 梅天风. Dynamic adversarial training method based on sample robustness disparity[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2026010036.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2026010036