Adversarial attack algorithm for deep learning interpretability
Quan CHEN, Li LI, Yongle CHEN, Yuexing DUAN
Journal of Computer Applications    2022, 42 (2): 510-518.   DOI: 10.11772/j.issn.1001-9081.2021020360

To address the model information leakage caused by interpretability in Deep Neural Networks (DNNs), the feasibility of using the Gradient-weighted Class Activation Mapping (Grad-CAM) interpretation method to generate adversarial samples in a white-box setting was demonstrated, and an untargeted black-box attack algorithm named dynamic genetic algorithm was proposed. In the algorithm, the fitness function was first improved according to the relationship between changes in the interpretation area and the positions of the perturbed pixels. Then, over multiple rounds of the genetic algorithm, the perturbation value was gradually reduced while the number of perturbed pixels was increased; the set of resulting coordinates from each round was retained and used in the next round, until the perturbed pixel set flipped the predicted label without exceeding the perturbation boundary. In the experiments, the proposed algorithm achieved an average attack success rate of 92.88% on the AlexNet, VGG-19, ResNet-50 and SqueezeNet models, 16.53 percentage points higher than that of the One pixel algorithm, at the cost of an 8% increase in running time. In addition, with a shorter running time, the success rate of the proposed algorithm was 3.18 percentage points higher than that of the Adaptive Fast Gradient Sign Method (Ada-FGSM) algorithm, 8.63 percentage points higher than that of the Projection & Probability-driven Black-box Attack (PPBA) algorithm, and close to that of the Boundary-attack algorithm. The results show that the dynamic genetic algorithm based on the interpretation method can effectively carry out adversarial attacks.
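
The multi-round procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the black-box model (`score`, `predict`) is a hypothetical toy classifier, the fitness is a plain stand-in (the model's score for the original label) rather than the interpretation-based fitness of the paper, and the schedule parameters (`eps0`, `decay`, `k0`, `eps_max`) are invented for the example. Only the overall shape follows the text: each round shrinks the perturbation value, enlarges the pixel set, and reuses the best coordinates of the previous round until the label flips within the perturbation bound.

```python
import random

def score(image):
    """Hypothetical black-box model: the class-1 score is the mean pixel value."""
    flat = [v for row in image for v in row]
    return sum(flat) / len(flat)

def predict(image):
    return 1 if score(image) > 0.5 else 0

def perturb(image, pixels, eps):
    """Shift each (row, col, sign) pixel of the original image by sign * eps, clipped to [0, 1]."""
    out = [row[:] for row in image]
    for r, c, s in pixels:
        # Always offset from the original value, so no pixel moves by more than eps.
        out[r][c] = min(1.0, max(0.0, image[r][c] + s * eps))
    return out

def fitness(image, pixels, eps, label):
    """Stand-in fitness (lower is better): model score of the original label."""
    s = score(perturb(image, pixels, eps))
    return s if label == 1 else 1.0 - s

def dynamic_ga(image, eps_max=0.3, rounds=6, eps0=0.8, decay=0.6,
               k0=2, pop_size=20, gens=20, seed=0):
    """Return (pixels, eps) for a successful attack, or (None, eps) on failure."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    label = predict(image)

    def gene():
        return (rng.randrange(h), rng.randrange(w), rng.choice((-1, 1)))

    kept, eps, k = [], eps0, k0
    for _ in range(rounds):
        # Seed every individual with the coordinates kept from the previous round.
        pop = [kept + [gene() for _ in range(k - len(kept))] for _ in range(pop_size)]
        for _ in range(gens):
            pop.sort(key=lambda ind: fitness(image, ind, eps, label))
            parents = pop[: pop_size // 2]        # elitist selection
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = rng.sample(parents, 2)
                cut = rng.randrange(1, k)         # one-point crossover
                child = a[:cut] + b[cut:]
                child[rng.randrange(k)] = gene()  # mutate one position
                children.append(child)
            pop = parents + children
        best = min(pop, key=lambda ind: fitness(image, ind, eps, label))
        flipped = predict(perturb(image, best, eps)) != label
        if flipped and eps <= eps_max:
            return best, eps
        # Next round: smaller perturbation, more pixels, reuse the best coordinates.
        kept, eps, k = best, eps * decay, k * 2
    return None, eps

# Usage on a 4x4 toy image that the toy model assigns to class 1.
image = [[0.55] * 4 for _ in range(4)]
pixels, eps = dynamic_ga(image)
```

Early rounds use a perturbation larger than `eps_max` only to locate promising coordinates; a result is accepted only once the perturbation value has dropped to the bound or below and the predicted label has changed.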
