Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (11): 3428-3435. DOI: 10.11772/j.issn.1001-9081.2022111677

• Artificial Intelligence •


Cross-model universal perturbation generation method based on geometric relationship

Jici ZHANG, Chunlong FAN, Cailong LI, Xuedong ZHENG

1. School of Computer Science, Shenyang Aerospace University, Shenyang, Liaoning 110136, China
  • Received:2022-11-11 Revised:2023-04-06 Accepted:2023-04-11 Online:2023-05-08 Published:2023-11-10
  • Contact: Chunlong FAN
  • About author: ZHANG Jici, born in 1998 in Haicheng, Liaoning, M. S. candidate, CCF member. Her research interests include deep learning and adversarial attacks.
    FAN Chunlong, born in 1973 in Shenyang, Liaoning, Ph. D., professor, CCF member. His research interests include neural network interpretability, complex network analysis, and intelligent system validation. E-mail: FanCHL@sau.edu.cn
    LI Cailong, born in 1997 in Shangrao, Jiangxi, M. S. candidate. His research interests include deep learning and adversarial attacks.
    ZHENG Xuedong, born in 1977 in Wuchang, Heilongjiang, Ph. D., professor. His research interests include DNA computing and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61972266)


Abstract:

Adversarial attacks add carefully designed perturbations to the input samples of neural network models, causing the models to output wrong results with high confidence. Research on adversarial attacks has mainly targeted single-model scenarios; attacks on multiple models are mostly realized through cross-model transfer attacks, and universal cross-model attack methods remain little studied. By analyzing the geometric relationships among multi-model attack perturbations, the orthogonality between the adversarial directions of different models, as well as the orthogonality between an adversarial direction and the decision boundary of a single model, was clarified, and a universal cross-model attack algorithm with corresponding optimization strategies was designed accordingly. The proposed algorithm was verified through multi-angle cross-model adversarial attack experiments on the CIFAR10 and SVHN datasets with six common neural network models. Experimental results show that the algorithm achieves an attack success rate of 1.0 in the given experimental scenarios, with the L2-norm of the perturbation no greater than 0.9. Compared with cross-model transfer attacks, the proposed algorithm improves the average attack success rate on the six models by up to 57% and has better universality.
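To make the geometric idea concrete: for two models f_i and f_j, their local adversarial directions r_i and r_j are reported to satisfy ⟨r_i, r_j⟩ ≈ 0 (i ≠ j), while each r_i is roughly normal to its own model's decision boundary. The sketch below shows one way such a cross-model perturbation search could look: each not-yet-fooled model contributes a normalized input-gradient direction, the per-model directions are accumulated into a single update, and the result is projected onto a small L2 ball (the abstract reports ‖δ‖₂ ≤ 0.9). This is a minimal PyTorch reconstruction from the abstract alone; the gradient-sum aggregation rule, step size, iteration budget, and the `cosine` helper are illustrative assumptions, not the authors' published algorithm.

```python
# Minimal sketch (assumptions, not the paper's algorithm): build one
# perturbation that attacks several white-box classifiers at once by
# accumulating each model's normalized input-gradient direction and
# projecting the result onto an L2 ball of radius eps.
import torch
import torch.nn.functional as F

def cosine(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity of two flattened perturbations; values near 0
    correspond to the cross-model orthogonality described above."""
    a, b = a.flatten(), b.flatten()
    return torch.dot(a, b) / (a.norm() * b.norm() + 1e-12)

def cross_model_perturbation(models, x, y, eps=0.9, step=0.1, iters=100):
    """x: input batch, y: true labels; returns a shared perturbation."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        direction = torch.zeros_like(x)
        fooled_all = True
        for model in models:
            logits = model(x + delta)
            if (logits.argmax(dim=1) == y).any():    # model not fully fooled
                fooled_all = False
                loss = F.cross_entropy(logits, y)    # raise loss on true label
                (g,) = torch.autograd.grad(loss, delta)
                direction += g / (g.norm() + 1e-12)  # unit per-model direction
        if fooled_all:                               # every model misclassifies
            break
        with torch.no_grad():
            delta += step * direction / (direction.norm() + 1e-12)
            norm = delta.norm()
            if norm > eps:                           # project onto the L2 ball
                delta.mul_(eps / norm)
    return delta.detach()
```

A hypothetical call would pass a list of models in eval mode, e.g. `delta = cross_model_perturbation([resnet20, vgg16, mobilenet], x, y)`; applying `cosine` to the individual models' input gradients then gives a quick empirical check of the near-orthogonality claim.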

Key words: deep learning, adversarial sample generation, adversarial attack, cross-model attack, classifier
