Transformer image dehazing based on component collaborative optimization pruning

doi:10.11772/j.issn.1001-9081.2025040395

Abstract

Transformer-based image dehazing algorithms achieve good dehazing results but suffer from large parameter counts and slow dehazing speed. To prune the redundant parts of a dehazing network in a targeted manner and shorten dehazing time without degrading dehazing quality, a Transformer image dehazing method based on component collaborative optimization pruning, CCOP-IDT (Component Collaborative Optimization Pruning Image Dehazing Transformer), is proposed. First, a five-level Transformer is used to build the pre-trained model of the dehazing network. Second, network pruning is formulated as an optimization problem: Fisher information is used to evaluate the importance of weight parameters, and the Hessian matrix is used to measure the joint influence of the pruned components on the network output, which yields a collaborative optimization method for multiple kinds of pruning components. Finally, an evolutionary algorithm is used to solve for the optimal pruning-rate sequence, giving the optimal sub-network of the pre-trained model. Experimental results show that pruning brings the number of network parameters down to 0.476×10⁶, a reduction of 28.8% relative to the unpruned network, and shortens the dehazing time by 25.0%. On the synthetic hazy dataset RESIDE-6K, the proposed method reaches a Peak Signal-to-Noise Ratio (PSNR) of 29.60 dB and a Structural SIMilarity (SSIM) of 0.968 7, which are only 1.63% and 0.46% lower, respectively, than before pruning. In both quantitative and qualitative evaluation, the proposed method performs well: it substantially reduces the model parameters and speeds up image dehazing while largely preserving the quantitative metrics and the subjective visual quality.

Key words: image dehazing, Transformer, model pruning, Fisher information, Hessian matrix, evolutionary algorithm
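
The pipeline summarized in the abstract can be illustrated with a toy sketch. It assumes a linear model with squared-error loss in place of the dehazing Transformer, uses the empirical Fisher diagonal (mean squared per-sample gradient) as the importance score, and leaves out the Hessian-based joint-influence term entirely. The function names, the discrete set of pruning rates, and the penalty used to enforce the parameter budget are illustrative assumptions, not the authors' implementation.

```python
import random

def fisher_diagonal(weights, samples):
    """Empirical Fisher diagonal: mean squared per-sample gradient of the loss.

    Toy loss (an assumption): squared error of a linear model y_hat = w . x.
    """
    fisher = [0.0] * len(weights)
    for x, y in samples:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad = 2.0 * err * xi            # d(err^2)/dw_i
            fisher[i] += grad * grad
    return [f / len(samples) for f in fisher]

def pruning_cost(rates, scores_per_component):
    """Importance removed when the lowest-scoring fraction of each component is pruned."""
    cost = 0.0
    for rate, scores in zip(rates, scores_per_component):
        k = int(round(rate * len(scores)))
        cost += sum(sorted(scores)[:k])
    return cost

def params_kept(rates, sizes):
    """Number of weights that survive pruning at the given per-component rates."""
    return sum(n - int(round(r * n)) for r, n in zip(rates, sizes))

def evolve_rates(scores_per_component, budget, generations=50, pop_size=20, seed=0):
    """Tiny evolutionary search for a pruning-rate sequence under a parameter budget."""
    rng = random.Random(seed)
    sizes = [len(s) for s in scores_per_component]
    choices = [0.0, 0.25, 0.5, 0.75]         # hypothetical discrete pruning rates

    def fitness(rates):
        # Candidates that keep more weights than the budget are heavily penalized.
        over = max(0, params_kept(rates, sizes) - budget)
        return pruning_cost(rates, scores_per_component) + 1e6 * over

    pop = [[rng.choice(choices) for _ in sizes] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]       # elitist selection: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]        # uniform crossover
            child[rng.randrange(len(child))] = rng.choice(choices)  # point mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

# Toy usage: 8 weights split into two "components" of 4 weights each.
data_rng = random.Random(1)
weights = [data_rng.uniform(-1, 1) for _ in range(8)]
samples = [([data_rng.uniform(-1, 1) for _ in range(8)], data_rng.uniform(-1, 1))
           for _ in range(32)]
scores = fisher_diagonal(weights, samples)
components = [scores[:4], scores[4:]]
best_rates = evolve_rates(components, budget=6)
print(best_rates, params_kept(best_rates, [4, 4]))
```

In this sketch each component is scored independently, so the search only trades off per-component importance against the budget; the method described above additionally couples the components through a Hessian-based term.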

CLC Number: TP391.41

Jixin GUO, Ting ZHANG. Transformer image dehazing based on component collaborative optimization pruning[J]. Journal of Computer Applications, 2026, 46(3): 933-939.

References

[1]	NARASIMHAN S G, NAYAR S K. Vision and the atmosphere [J]. International Journal of Computer Vision, 2002, 48(3): 233-254.
[2]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 6000-6010.
[3]	ZHENG C, LI Z, ZHANG K, et al. SAViT: structure-aware vision Transformer pruning via collaborative optimization [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 9010-9023.
[4]	RAHMAN Z, JOBSON D J, WOODELL G A. Multi-scale retinex for color image enhancement [C]// Proceedings of the 3rd IEEE International Conference on Image Processing - Volume 3. Piscataway: IEEE, 1996: 1003-1006.
[5]	HE K, SUN J, TANG X. Single image haze removal using dark channel prior [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(12): 2341-2353.
[6]	CAI B, XU X, JIA K, et al. DehazeNet: an end-to-end system for single image haze removal [J]. IEEE Transactions on Image Processing, 2016, 25(11): 5187-5198.
[7]	QIN X, WANG Z, BAI Y, et al. FFA-Net: feature fusion attention network for single image dehazing [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020: 11908-11915.
[8]	ZHAO S, ZHANG L, SHEN Y, et al. RefineDNet: a weakly supervised refinement framework for single image dehazing [J]. IEEE Transactions on Image Processing, 2021, 30: 3391-3404.
[9]	ZHANG T, ZHAO X, CHEN W X. Image dehazing method based on conditional generative adversarial network [J]. Journal of Computer Applications, 2021, 41(S2): 248-253. (in Chinese)
[10]	SONG Y, HE Z, QIAN H, et al. Vision Transformers for single image dehazing [J]. IEEE Transactions on Image Processing, 2023, 32: 1927-1941.
[11]	QIU Y, ZHANG K, WANG C, et al. MB-TaylorFormer: multi-branch efficient Transformer expanded by Taylor formula for image dehazing [C]// Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2023: 12756-12767.
[12]	LIU Y, LIU H, LI L, et al. A data-centric solution to nonhomogeneous dehazing via Vision Transformer [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2023: 1406-1415.
[13]	GUO C, YAN Q, ANWAR S, et al. Image dehazing Transformer with transmission-aware 3D position embedding [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 5802-5810.
[14]	LI X, HUA Z, LI J. Two-stage single image dehazing network using Swin-Transformer [J]. IET Image Processing, 2022, 16(9): 2518-2534.
[15]	YANG Y H, CHENG H, WEI J H, et al. High-precision lightweight quantization inference method for prevalent activation functions in Transformer models in edge device deployment [J]. Acta Electronica Sinica, 2024, 52(10): 3301-3311. (in Chinese)
[16]	QIU M B, GAO J, LIN S B, et al. Efficient Transformer tracking for the edge end with linearly decomposed attention [J]. Journal of Image and Graphics, 2025, 30(2): 485-502. (in Chinese)
[17]	LeCUN Y, DENKER J S, SOLLA S A. Optimal brain damage [C]// Proceedings of the 1989 Advances in Neural Information Processing Systems. Cambridge: MIT Press, 1989: 598-605.
[18]	TANG Y, HAN K, WANG Y, et al. Patch slimming for efficient Vision Transformers [C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2022: 12155-12164.
[19]	YU F, HUANG K, WANG M, et al. Width & depth pruning for Vision Transformers [C]// Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2022: 3143-3151.
[20]	YANG H, YIN H, SHEN M, et al. Global Vision Transformer pruning with hessian-aware saliency [C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 18547-18557.
[21]	WANG H, DEDHIA B, JHA N K. Zero-TPrune: zero-shot token pruning through leveraging of the attention graph in pre-trained Transformers [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 16070-16079.
[22]	CAO J, YE P, LI S, et al. MADTP: multimodal alignment-guided dynamic token pruning for accelerating vision-language Transformer [C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 15710-15719.
[23]	GUO J, ZHANG T. Transformer-based image dehazing with accurate color restoration [C]// Proceedings of the 13th International Conference on Computing and Pattern Recognition. New York: ACM, 2025: 265-271.
[24]	LIU L, ZHANG S, KUANG Z, et al. Group Fisher pruning for practical network compression [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 7021-7032.
[25]	LI B, REN W, FU D, et al. Benchmarking single-image dehazing and beyond [J]. IEEE Transactions on Image Processing, 2019, 28(1): 492-505.
[26]	ZHAO S, ZHANG L, HUANG S, et al. Dehazing evaluation: real-world benchmark datasets, criteria, and baselines [J]. IEEE Transactions on Image Processing, 2020, 29: 6947-6962.

Module	Depth	Attention heads	Output feature dimensions	Parameters/10⁶
B₁	4	2	[24, 256, 256]	0.019
B₂	4	4	[48, 128, 128]	0.112
B₃	4	6	[96, 64, 64]	0.446
B₄	2	1	[48, 128, 128]	0.031
B₅	2	1	[24, 256, 256]	0.008
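
As a quick piece of arithmetic on the per-module parameter counts in the table above, the sketch below sums the five listed modules and reports each module's share of that sum. The variable names are illustrative, and the sum covers only the five modules listed, not any other layers of the network.

```python
# Parameter counts (in units of 10^6) copied from the module table above.
module_params = {"B1": 0.019, "B2": 0.112, "B3": 0.446, "B4": 0.031, "B5": 0.008}

total = sum(module_params.values())
shares = {name: count / total for name, count in module_params.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the listed module parameters")
print(f"total over the five modules: {total:.3f} x 10^6")
```

The five modules sum to about 0.616×10⁶ parameters, of which B₃ alone accounts for roughly 72%.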

Item	Value
CPU	Intel Core i7-7700K
CPU base frequency	4.20 GHz
GPU	NVIDIA GeForce RTX 2080Ti
GPU memory	11 GB
CUDA	11.6
Python version	3.8
Deep learning framework	PyTorch 1.12.1
Batch size	4
Epochs	300
Learning rate	4×10⁻⁴
Optimizer	Adam
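
The training settings listed above could be gathered into a single configuration object, as in the minimal sketch below. The class name `TrainConfig` and its field names are hypothetical; only the values come from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Training settings copied from the table above; names are hypothetical."""
    batch_size: int = 4
    epochs: int = 300
    learning_rate: float = 4e-4
    optimizer: str = "Adam"

cfg = TrainConfig()
# With PyTorch, the optimizer would typically be created as:
#   torch.optim.Adam(model.parameters(), lr=cfg.learning_rate)
print(cfg)
```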

Method	Dehazing metrics		Efficiency metrics		
	PSNR/dB	SSIM	Parameters/10⁶	Average time/s	Computation/GFLOPs
FFA-Net[7]	21.98	0.892 9	4.456	0.891	575.60
RefineDNet[8]	20.62	0.841 9	65.795	0.303	27.82
DehazeFormer[10]	29.99	0.971 7	0.686	0.139	13.32
Proposed method (before pruning)	30.09	0.973 2	0.669	0.132	12.56
Proposed method (after pruning)	29.60	0.968 7	0.476	0.099	5.24
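
The relative changes quoted in the abstract can be checked against the last two rows of the table above. The short sketch below does that arithmetic; the helper name `relative_reduction` and the dictionary keys are illustrative.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100.0

# Values copied from the before-pruning and after-pruning rows of the table.
before = {"PSNR/dB": 30.09, "SSIM": 0.9732, "params/1e6": 0.669,
          "time/s": 0.132, "GFLOPs": 12.56}
after = {"PSNR/dB": 29.60, "SSIM": 0.9687, "params/1e6": 0.476,
         "time/s": 0.099, "GFLOPs": 5.24}

reductions = {name: relative_reduction(before[name], after[name]) for name in before}
for name, pct in reductions.items():
    print(f"{name}: {pct:.2f}% lower after pruning")
```

This gives reductions of about 1.63% (PSNR), 0.46% (SSIM), 28.85% (parameters), and 25.00% (average time), consistent with the figures in the abstract. The computation column, which the abstract does not quote, drops by about 58.28%.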