Dynamic targeted recovery method for backdoor model purification

程欣铭1, 黄荣1, 刘浩2, 蒋学芹3

  1. Donghua University
  2. College of Information Science and Technology, Donghua University
  3. Donghua University
  • Received: 2025-06-23  Revised: 2025-09-17  Online: 2025-10-15  Published: 2025-10-15
  • Corresponding author: 黄荣
  • Supported by:
    National Natural Science Foundation of China; Fundamental Research Funds for the Central Universities

Abstract: Backdoor attacks on Deep Neural Networks (DNN) severely compromise the trustworthiness of model decisions, and existing defense methods that rely on one-time pruning or global fine-tuning often lead to a significant degradation of the model's clean accuracy. To address this problem, a dynamic targeted recovery method for backdoor model purification was proposed. First, pre-activations were utilized to characterize neuron behavior, enabling the localization of poisoned neurons with abnormal behavior. During model purification, targeted recovery was implemented by fine-tuning only the located poisoned neurons, which avoided introducing disturbances to clean neurons and better maintained the model's clean accuracy. Second, during purification, neuron behavior was monitored to obtain feedback on the purification process, allowing poisoned neurons to be re-localized dynamically. In this process, a tabu search strategy was introduced to exclude interference from stubborn neurons that contribute little to purification, thereby accelerating convergence. Experiments on 3 benchmark datasets against 6 backdoor attacks, including BadNets (Backdoored Neural Network), showed that the proposed method reduced the Attack Success Rate (ASR) to 0.21% on average and improved the clean Accuracy (ACC) by 0.1–2.9 percentage points, outperforming 5 other defense methods such as ABL (Anti-Backdoor Learning). The proposed dynamic targeted recovery method effectively overcomes the clean accuracy degradation caused by one-time pruning or global fine-tuning in traditional methods, providing a more reliable solution for enhancing DNN security.

Key words: backdoor model purification, targeted recovery, dynamic localization, pre-activation, tabu search
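
Since the abstract only outlines the procedure, the PyTorch sketch below is one possible reading of it rather than the authors' implementation. The helper names (channel_preactivation_stats, locate_poisoned, purify), the per-channel mean absolute pre-activation statistic, the median-deviation anomaly score, the top-k selection, and the move_eps threshold for tagging stubborn neurons are all assumptions introduced here for illustration; the sketch also assumes the inspected layer is a convolution followed by a separate nonlinearity, so that the layer's raw output can serve as the pre-activation.

```python
# Illustrative sketch only: the statistics, thresholds, and helper names are
# assumptions, not the method as published.
import torch
import torch.nn.functional as F


@torch.no_grad()
def channel_preactivation_stats(model, layer, clean_loader, device="cpu"):
    """Mean absolute pre-activation per channel of `layer` on a small clean
    set. Assumes `layer` is a Conv2d whose nonlinearity is a separate module,
    so its raw output is the pre-activation."""
    grabbed = {}

    def hook(_module, _inputs, output):
        grabbed["z"] = output.detach()

    handle = layer.register_forward_hook(hook)
    total, batches = None, 0
    for x, _ in clean_loader:
        model(x.to(device))
        z = grabbed["z"].abs().mean(dim=(0, 2, 3))  # one value per channel
        total = z if total is None else total + z
        batches += 1
    handle.remove()
    return total / batches


def locate_poisoned(stats, tabu, k=10):
    """Flag the k channels whose statistic deviates most from the median
    (assumed anomaly score), skipping channels already on the tabu list."""
    deviation = (stats - stats.median()).abs()
    ranked = torch.argsort(deviation, descending=True).tolist()
    return [c for c in ranked if c not in tabu][:k]


def purify(model, layer, clean_loader, rounds=5, lr=1e-3, move_eps=1e-3,
           device="cpu"):
    """Dynamic targeted recovery loop: re-localize suspicious channels every
    round, fine-tune only those channels' filters on clean data, and place
    channels that barely change (stubborn, low contribution) on a tabu list."""
    model.to(device)
    tabu = set()
    for _ in range(rounds):
        stats = channel_preactivation_stats(model, layer, clean_loader, device)
        targets = locate_poisoned(stats, tabu)
        if not targets:
            break
        before = layer.weight.data[targets].clone()

        # Gradient mask restricts updates to the flagged channels only,
        # leaving clean neurons untouched (targeted recovery).
        mask = torch.zeros_like(layer.weight)
        mask[targets] = 1.0
        opt = torch.optim.SGD([layer.weight], lr=lr)
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            model.zero_grad(set_to_none=True)
            F.cross_entropy(model(x), y).backward()
            layer.weight.grad.mul_(mask)
            opt.step()

        # Tabu step: channels whose filters moved less than `move_eps`
        # contributed little to purification and are skipped in later rounds.
        moved = (layer.weight.data[targets] - before).flatten(1).norm(dim=1)
        tabu.update(c for c, m in zip(targets, moved) if m.item() < move_eps)
    return model
```

A typical call would be purify(model, some_conv_layer, clean_loader), where some_conv_layer is a late convolutional layer of the backdoored network and clean_loader holds a small clean calibration set; the gradient mask realizes the idea of touching only the flagged neurons, while the tabu set keeps neurons that stopped responding to fine-tuning out of later localization rounds.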
