Motion blur removal algorithm for QR code images based on blur kernel estimation and alternating Transformer

doi:10.11772/j.issn.1001-9081.2023121861

Journal of Computer Applications (official website): 234-239. DOI: 10.11772/j.issn.1001-9081.2023121861

• Multimedia computing and computer simulation •

Bin SHI^1,2, Miao CHENG^1,2,3 (corresponding author), Shaobing ZHANG^1,2,3, Shang ZENG^1,2

^1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610213, China
^2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
^3. Shenzhen CBPM-KEXIN Banking Technology Company Limited, Shenzhen Guangdong 518206, China

Received: 2024-01-10 Revised: 2024-04-10 Accepted: 2024-04-11 Online: 2025-01-24 Published: 2024-12-31
Corresponding author: Miao CHENG
About the authors:
Bin SHI (born 1997), male, from Deyang, Sichuan, M.S. candidate. Research interests: artificial intelligence, machine vision.
Miao CHENG (born 1983), male, from Chengdu, Sichuan, senior engineer, M.S. Research interests: artificial intelligence, machine vision.
Shaobing ZHANG (born 1979), male, from Chengdu, Sichuan, professor-level senior engineer, M.S. Research interests: high-speed image processing, defect detection, deep learning.
Shang ZENG (born 1995), male, from Jingmen, Hubei, Ph.D. candidate. Research interests: big data analysis, data mining.

Abstract:

In production and daily life, motion blur makes Quick Response (QR) code recognition more difficult. To address this problem, a motion blur removal algorithm for QR code images based on blur kernel estimation and an alternating Transformer is proposed. Firstly, because current motion deblurring algorithms offer no explanation of the intermediate degradation process, a blur Kernel Estimation Network (KEN) is used to dynamically estimate the shape and parameters of the blur kernel; Wiener filtering is then performed with the KEN output and the original image, and the result guides the subsequent restoration network to remove motion blur more effectively. Secondly, because window-based Transformers are weak at capturing global features and traditional Transformers have high computational complexity, an Alternating Transformer Block (ATB) that combines a Local-window Transformer Block (LTB) and a Global-axis Transformer Block (GTB) is proposed to extract local and global features alternately. Finally, because a model whose input is a single-scale image cannot handle different levels of blur, a Multi-Scale Feature Fusion Block (MSFFB) is proposed, so that the model can extract features from multi-scale input images, use contextual information effectively, and better preserve and restore image details. Experimental results on a motion-blurred QR code image dataset show that, on the linear blur kernel test set, the proposed algorithm improves Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) by 3.11% and 1.23% respectively over Uformer (U-shaped Transformer)-B, which has the second-best restoration effect; on the nonlinear blur kernel test set, it improves PSNR and SSIM over Uformer-B by 7.13% and 2.19% respectively. At the same time, its Multiply-ACcumulate operations (MAC) are reduced by 77.22% relative to Uformer-B, the lowest among all compared algorithms, and its number of model Parameters (Param) is reduced by 83.5%. In addition, object detection and recognition experiments with YOLOv4 and ZBar show that the proposed algorithm has practical value for improving the efficiency of QR code detection and recognition.

Key words: Transformer, Quick Response (QR) code, motion blurred image restoration, blur kernel estimation, multi-scale
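
The abstract describes applying Wiener filtering to the blurred image using the blur kernel estimated by KEN. The sketch below illustrates only that classical step, under loud assumptions: the kernel is a hand-built linear motion kernel rather than a network output, the blur is circular (FFT-based), and the noise-to-signal constant `nsr` is an arbitrary choice. All function names are hypothetical and none of this is the authors' code.

```python
import numpy as np


def linear_motion_kernel(length, angle_deg, size=15):
    """Rasterise a straight-line (linear motion) blur kernel that sums to 1."""
    kernel = np.zeros((size, size))
    centre = size // 2
    theta = np.deg2rad(angle_deg)
    # Sample points densely along the line and mark the pixels they fall in.
    for t in np.linspace(-(length - 1) / 2, (length - 1) / 2, 4 * length):
        col = int(round(centre + t * np.cos(theta)))
        row = int(round(centre - t * np.sin(theta)))
        if 0 <= row < size and 0 <= col < size:
            kernel[row, col] = 1.0
    return kernel / kernel.sum()


def kernel_spectrum(kernel, shape):
    """FFT of the kernel, zero-padded to `shape` with its centre moved to (0, 0)."""
    padded = np.zeros(shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def blur(image, kernel):
    """Circular convolution with the kernel, computed in the frequency domain."""
    spectrum = np.fft.fft2(image) * kernel_spectrum(kernel, image.shape)
    return np.real(np.fft.ifft2(spectrum))


def wiener_deconvolve(blurred, kernel, nsr=1e-3):
    """Wiener filter conj(K) / (|K|^2 + nsr), applied in the frequency domain."""
    K = kernel_spectrum(kernel, blurred.shape)
    wiener = np.conj(K) / (np.abs(K) ** 2 + nsr)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * wiener))


# Synthetic QR-like pattern: 16 x 16 random binary modules, 8 pixels each.
rng = np.random.default_rng(0)
modules = rng.integers(0, 2, size=(16, 16)).astype(float)
sharp = np.kron(modules, np.ones((8, 8)))

kernel = linear_motion_kernel(length=9, angle_deg=0.0)  # assumed, not estimated
blurred = blur(sharp, kernel)
restored = wiener_deconvolve(blurred, kernel)

mse_blurred = np.mean((blurred - sharp) ** 2)
mse_restored = np.mean((restored - sharp) ** 2)
print(f"MSE blurred: {mse_blurred:.5f}, MSE after Wiener filtering: {mse_restored:.5f}")
```

In this noise-free toy setting the filtered image is much closer to the sharp pattern than the blurred one. With real images the kernel estimate is imperfect and noise is present, which is presumably why the abstract uses the Wiener-filtered result only as guidance for a learned restoration network.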

CLC number: TP391.4

Bin SHI, Miao CHENG, Shaobing ZHANG, Shang ZENG. Motion blur removal algorithm for QR code images based on blur kernel estimation and alternating Transformer[J]. Journal of Computer Applications: 234-239.

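Figures and tables: 11

Figure 1 Overall structure of the algorithm

Figure 2 Blur kernel estimation module

Figure 3 Images after Wiener filtering

Figure 4 A-MSA structure

Figure 5 MDFF structure

Figure 6 MSFFB structure

Figure 7 Examples from the motion-blurred QR code dataset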
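
The abstract contrasts window-based attention (cheap but local) with traditional global attention (global but costly) and proposes alternating local-window and global-axis blocks. The sketch below illustrates only how tokens could be grouped in each case and how many pairwise attention scores each grouping needs. It is not the paper's LTB, GTB or A-MSA: the function names, the single-head attention with identity projections, and the 32 x 32 x 8 feature map with 8 x 8 windows are all assumptions made for illustration.

```python
import numpy as np


def window_groups(x, win):
    """Split an (H, W, C) map into non-overlapping windows: (num_windows, win*win, C)."""
    H, W, C = x.shape
    x = x.reshape(H // win, win, W // win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)


def axis_groups(x, along):
    """Treat every row (or every column) as one token sequence spanning the whole map."""
    return x if along == "rows" else x.transpose(1, 0, 2)


def self_attention(groups):
    """Plain softmax attention inside each group (identity Q/K/V projections)."""
    d = groups.shape[-1]
    scores = groups @ groups.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ groups


def pairwise_scores(groups):
    """Number of query-key scores computed for a grouping."""
    n_groups, n_tokens, _ = groups.shape
    return n_groups * n_tokens * n_tokens


rng = np.random.default_rng(0)
features = rng.standard_normal((32, 32, 8))  # hypothetical H x W x C feature map

local = window_groups(features, win=8)
rows = axis_groups(features, "rows")
cols = axis_groups(features, "cols")

full_cost = (32 * 32) ** 2  # every token attends to every token
local_cost = pairwise_scores(local)
axis_cost = pairwise_scores(rows) + pairwise_scores(cols)
print(full_cost, local_cost, axis_cost)

# Attention can be run inside each grouping; shapes are preserved.
local_out = self_attention(local)
row_out = self_attention(rows)
```

For this configuration the script prints `1048576 65536 65536`: both groupings need 16 times fewer scores than full attention, while the row and column groups still span the entire width and height of the map.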

Table 1 Comparison with existing algorithms on the QRblur dataset

Deblurring algorithm	Linear kernel PSNR/dB	Linear kernel SSIM	Nonlinear kernel PSNR/dB	Nonlinear kernel SSIM	MAC/GFLOPs	Param/$10^6$
DeblurGAN	26.4389	0.7582	24.1378	0.7038	419.69	11.00
DeblurGAN-v2	27.5683	0.8439	24.5706	0.7916	238.47	9.46
MIMO-UNet	28.4892	0.9035	25.5269	0.8533	602.59	6.81
Uformer-B	29.3648	0.9328	25.8564	0.8568	771.98	50.39
Proposed algorithm	30.2784	0.9443	27.7015	0.8756	175.79	8.31
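
The percentages quoted in the abstract can be checked against the Uformer-B and proposed-algorithm rows of Table 1. The short script below is only a worked arithmetic check; the dictionary keys and the helper name are hypothetical, and the numbers are copied from the table.

```python
# Values copied from Table 1 (Uformer-B row and proposed-algorithm row).
uformer_b = {
    "psnr_linear": 29.3648, "ssim_linear": 0.9328,
    "psnr_nonlinear": 25.8564, "ssim_nonlinear": 0.8568,
    "mac": 771.98, "param": 50.39,
}
proposed = {
    "psnr_linear": 30.2784, "ssim_linear": 0.9443,
    "psnr_nonlinear": 27.7015, "ssim_nonlinear": 0.8756,
    "mac": 175.79, "param": 8.31,
}


def relative_change_percent(new, baseline):
    """Signed change of `new` relative to `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


changes = {
    key: relative_change_percent(proposed[key], uformer_b[key])
    for key in uformer_b
}
for key, value in changes.items():
    print(f"{key}: {value:+.2f}%")
```

The results agree with the abstract to within the last digit: about +3.11% and +1.23% (linear kernels), +7.14% and +2.19% (nonlinear kernels), -77.23% for MAC and -83.51% for Param. The small differences from the stated 7.13% and 77.22% are consistent with truncation rather than rounding.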

Figure 8 Subjective (visual) comparison of the results of different methods

Table 2 Ablation experiment results

Experiment No.	PSNR/dB	SSIM	MAC/GFLOPs	Param/$10^6$
1	27.0627	0.8805	96.16	5.20
2	27.4505	0.8851	132.01	6.83
3	27.5003	0.8864	114.08	6.01
4	27.6105	0.8874	118.01	6.08
5	29.6219	0.9081	175.79	8.31
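
The tables report restoration quality as PSNR in dB. For readers unfamiliar with the metric, the sketch below shows the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is a generic illustration, not the authors' evaluation code: the function name and the assumed peak value of 255 (8-bit images) are assumptions.

```python
import math

import numpy as np


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: every pixel of the restored image is off by 16 grey levels.
reference = np.zeros((8, 8), dtype=np.uint8)
restored = np.full((8, 8), 16, dtype=np.uint8)
value = psnr(reference, restored)
print(f"PSNR: {value:.2f} dB")
```

Because PSNR is logarithmic, a fixed gain in dB corresponds to a fixed ratio of mean squared errors, which is worth keeping in mind when reading relative improvements expressed as percentages of a dB value.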

Figure 9 Comparison of object detection results before and after motion deblurring
