Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (7): 1849-1856.DOI: 10.11772/j.issn.1001-9081.2020081282

Special Issue: Artificial Intelligence

• Artificial intelligence •

Difference detection method of adversarial samples oriented to deep learning

WANG Shuyan, HOU Zeyu, SUN Jiaze   

  1. Trusted Software Laboratory, Xi'an University of Posts and Telecommunications, Xi'an Shaanxi 710121, China
  • Received:2020-08-27 Revised:2020-11-29 Online:2021-07-10 Published:2020-12-17
  • Supported by:
    This work is partially supported by the Key Research and Development Program of Shaanxi Province (2020GY-010) and the Science and Technology Program of Xi'an (2019218114GXRC017CG018-GXYD17.10).

  • Corresponding author: HOU Zeyu
  • About the authors: WANG Shuyan (1964-), female, born in Nanyang, Henan, professor, Ph.D.; her research interests include software testing and intelligent information processing. HOU Zeyu (1990-), male, born in Xi'an, Shaanxi, M.S. candidate; his research interests include data mining and security testing of artificial intelligence. SUN Jiaze (1980-), male, born in Nanyang, Henan, professor, Ph.D., CCF member; his research interests include swarm intelligence algorithms and software testing.

Abstract: Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial sample attacks in many key deep learning systems such as face recognition and intelligent driving, while existing detection of the various types of adversarial samples suffers from insufficient coverage and low efficiency. Therefore, an adversarial sample difference detection method oriented to deep learning models was proposed. Firstly, a residual neural network model commonly used in industrial production was constructed as the model of the adversarial sample generation and detection system. Then, multiple adversarial attacks were applied to the deep learning model to generate groups of adversarial samples. Finally, a sample difference detection system was constructed, consisting of three sub-systems (confidence detection, perceptibility detection, and anti-interference detection) with seven detection methods in total. Empirical studies with the proposed method were carried out on the MNIST and Cifar-10 datasets. The results show that adversarial samples produced by different adversarial attacks differ markedly in the confidence, perceptibility, and anti-interference measurements; for example, adversarial samples with excellent perceptibility indicators perform significantly worse than other types of adversarial samples in the confidence and anti-interference detections. The results also show that these differences are consistent across the two datasets. By applying this detection method, the comprehensiveness and diversity of the model's detection of adversarial samples can be effectively improved.
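The attack-then-measure pipeline sketched in the abstract (generate adversarial samples against a model, then compare them on confidence and perceptibility metrics) can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy: a linear softmax classifier stands in for the paper's residual network, one-step FGSM stands in for the paper's multiple attacks, and the metric names are illustrative, not the paper's seven detection methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear softmax classifier standing in for the paper's residual network
# (assumption: 10 classes, a 784-dimensional MNIST-like input in [0, 1]).
W = rng.normal(size=(10, 784)) * 0.05

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(x):
    return softmax(W @ x)

def fgsm(x, y, eps):
    """One-step FGSM: move x along the sign of the loss gradient.

    For a linear model, the cross-entropy gradient w.r.t. x is
    W^T (p - onehot(y)); deeper models would use backpropagation here.
    """
    p = predict(x)
    onehot = np.zeros(10)
    onehot[y] = 1.0
    grad = W.T @ (p - onehot)
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

x = rng.random(784)               # clean sample
y = int(np.argmax(predict(x)))    # use the model's own prediction as the label

x_adv = fgsm(x, y, eps=0.1)

# Difference-detection metrics in the spirit of the paper's sub-systems:
conf_clean = predict(x)[y]        # confidence on the clean sample
conf_adv = predict(x_adv)[y]      # confidence after the attack (should drop)
l2 = np.linalg.norm(x_adv - x)    # perceptibility: L2 distortion
linf = np.abs(x_adv - x).max()    # perceptibility: L-infinity distortion

print(f"confidence: {conf_clean:.3f} -> {conf_adv:.3f}")
print(f"L2 = {l2:.3f}, Linf = {linf:.3f}")
```

Running the same measurements on samples produced by different attacks (e.g. FGSM vs. an iterative attack) is what exposes the differences the abstract reports: an attack tuned for low distortion tends to score worse on confidence reduction, and vice versa.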

Key words: Deep Neural Network (DNN), adversarial attack, adversarial sample, residual neural network, difference detection


CLC Number: