Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (2): 467-473. DOI: 10.11772/j.issn.1001-9081.2021122068

• Cyber security •

Poisoning attack toward visual classification model

Jie LIANG, Xiaoyan HAO, Yongle CHEN

  1. College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
  • Received: 2021-12-09 Revised: 2022-04-14 Accepted: 2022-04-22 Online: 2022-05-16 Published: 2023-02-10
  • Contact: Xiaoyan HAO
  • About author: LIANG Jie, born in 1996, M. S. candidate. Her research interests include Internet of Things and information security.
    CHEN Yongle, born in 1983, Ph. D., associate professor. His research interests include Internet of Things and other emerging networks, network and system security, and information security.
  • Supported by:
    Key Research and Development Plan of Shanxi Province(201903D121121)

Abstract:

In backdoor-style data poisoning attacks, the attacker manipulates the distribution of the training data by inserting samples with hidden triggers into the training set, causing test samples to be misclassified and thereby changing the model's behavior and degrading its performance. A drawback of existing triggers is their sample independence: whatever trigger pattern is adopted, different poisoned samples carry the same trigger. To address this, a sample-based attack method combining image steganography with Deep Convolutional Generative Adversarial Network (DCGAN) was proposed. Image texture feature maps were first generated from the gray-level co-occurrence matrix; the target label character was then embedded into the texture feature maps as a trigger by image steganography, and the trigger-bearing texture feature maps were combined with clean samples to form poisoned samples. A large number of fake trigger-bearing images were then generated by DCGAN and mixed with the original poisoned samples in the training set, so that after injecting only a small number of poisoned samples the attacker achieves a high attack success rate while ensuring the effectiveness, sustainability and concealment of the trigger. Experimental results show that the proposed method avoids the drawback of sample independence while the model accuracy reaches 93.78%. With a poisoned-sample proportion of 30%, data preprocessing, pruning defense and AUROR defense have minimal influence on the attack, and the attack success rate still reaches about 56%.
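As a rough illustration of the trigger-construction step described above, the following Python sketch builds a GLCM-based texture feature map, hides the target-label character in its least significant bits, and blends the result with a clean sample. This is a minimal sketch, not the authors' implementation: the window size, the "contrast" statistic, the LSB scheme and the blending weight alpha are assumptions made here for illustration only.

```python
# Sketch of the trigger-construction step (assumed details, gray-scale images).
import numpy as np
from skimage.feature import graycomatrix, graycoprops   # scikit-image >= 0.19

def texture_feature_map(gray_img, win=7, levels=64):
    """Per-pixel texture map from local gray-level co-occurrence matrices."""
    q = (gray_img.astype(np.float64) / 256.0 * levels).astype(np.uint8)  # quantize
    pad = win // 2
    padded = np.pad(q, pad, mode="reflect")
    out = np.zeros(gray_img.shape, dtype=np.float64)
    for i in range(gray_img.shape[0]):
        for j in range(gray_img.shape[1]):
            patch = padded[i:i + win, j:j + win]
            glcm = graycomatrix(patch, distances=[1], angles=[0],
                                levels=levels, symmetric=True, normed=True)
            out[i, j] = graycoprops(glcm, "contrast")[0, 0]
    out = 255.0 * (out - out.min()) / (np.ptp(out) + 1e-8)  # rescale to 0..255
    return out.astype(np.uint8)

def embed_label_lsb(texture_map, label_char):
    """Hide the 8-bit target-label character in the first pixels' LSBs."""
    bits = np.array([int(b) for b in format(ord(label_char), "08b")], dtype=np.uint8)
    flat = texture_map.flatten()            # flatten() returns a copy
    flat[:8] = (flat[:8] & 0xFE) | bits
    return flat.reshape(texture_map.shape)

def make_poisoned_sample(clean_img, trigger_map, alpha=0.1):
    """Blend the steganographic trigger into a clean image (alpha = trigger strength)."""
    mixed = (1.0 - alpha) * clean_img.astype(np.float64) + alpha * trigger_map
    return np.clip(mixed, 0, 255).astype(np.uint8)
```

The second stage of the method, in which a DCGAN trained on such trigger-bearing images multiplies the poisoned set, follows the standard generator/discriminator setup and is omitted from this sketch.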

Key words: visual classification model, poisoning attack, backdoor attack, trigger, image steganography, Deep Convolutional Generative Adversarial Network (DCGAN)

CLC Number: