Journal of Computer Applications: 234-239. DOI: 10.11772/j.issn.1001-9081.2023121861

• Multimedia Computing and Computer Simulation •

Motion blur removal algorithm for QR code images based on blur kernel estimation and alternating Transformer

Bin SHI¹,², Miao CHENG¹,²,³, Shaobing ZHANG¹,²,³, Shang ZENG¹,²

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, Sichuan 610213, China
  2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  3. Shenzhen CBPM-KEXIN Banking Technology Company Limited, Shenzhen, Guangdong 518206, China
  • Received: 2024-01-10 Revised: 2024-04-10 Accepted: 2024-04-11 Online: 2025-01-24 Published: 2024-12-31
  • Corresponding author: Miao CHENG
  • About the authors: Bin SHI (born 1997), male, from Deyang, Sichuan; master's student; research interests: artificial intelligence, machine vision.
    Miao CHENG (born 1983), male, from Chengdu, Sichuan; senior engineer, M.S.; research interests: artificial intelligence, machine vision.
    Shaobing ZHANG (born 1979), male, from Chengdu, Sichuan; professor-level senior engineer, M.S.; research interests: high-speed image processing, defect detection, deep learning.
    Shang ZENG (born 1995), male, from Jingmen, Hubei; Ph.D. candidate; research interests: big data analysis, data mining.

Abstract:

Motion blur makes Quick Response (QR) codes harder to recognize in production and daily life. To address this problem, a motion blur removal algorithm for QR code images based on blur kernel estimation and an alternating Transformer was proposed. Firstly, because current motion deblurring algorithms do not explain the intermediate degradation process, a blur Kernel Estimation Network (KEN) was used to dynamically estimate the shape and parameters of the blur kernel; Wiener filtering was then applied to the original image using the KEN output, and the result guided the subsequent restoration network toward better deblurring. Secondly, because window-based Transformers are weak at capturing global features and traditional Transformers have high computational complexity, an Alternating Transformer Block (ATB) combining a Local-window Transformer Block (LTB) with a Global-axis Transformer Block (GTB) was proposed to extract local and global features alternately. Finally, because a model that takes a single-scale image as input cannot handle different levels of blur, a Multi-Scale Feature Fusion Block (MSFFB) was proposed, which allows the model to extract features from multi-scale input images, use contextual information effectively, and better preserve and restore image details. Experimental results on a motion-blurred QR code image dataset show that, on the linear blur kernel test set, the proposed algorithm improves Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) by 3.11% and 1.23%, respectively, over Uformer (U-shaped Transformer)-B, which has the second-best restoration performance. On the nonlinear blur kernel test set, it improves PSNR and SSIM over Uformer-B by 7.13% and 2.19%, respectively; at the same time, it reduces Multiply-ACcumulate operations (MAC) by 77.22%, the best result among all compared algorithms, and the number of model Parameters (Param) by 83.5%. In addition, object detection and recognition experiments with YOLOv4 and ZBar show that the proposed algorithm is of practical value for improving the efficiency of QR code detection and recognition.

Key words: Transformer, Quick Response (QR) code, motion-blurred image restoration, blur kernel estimation, multi-scale
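
The Wiener filtering step mentioned in the abstract can be illustrated with a classical frequency-domain sketch. This is not the authors' implementation: in the paper the blur kernel comes from the learned KEN, whereas here the kernel is assumed to be known, the noise-to-signal ratio `nsr` is an assumed constant, and boundaries are treated as circular. All function names (`linear_motion_kernel`, `psf_to_otf`, `wiener_deconvolve`, `psnr`) and the synthetic QR-like test pattern are hypothetical and chosen only for illustration.

```python
import numpy as np


def linear_motion_kernel(length: int, size: int) -> np.ndarray:
    """Horizontal linear motion-blur kernel of `length` pixels in a size x size grid."""
    kernel = np.zeros((size, size))
    start = (size - length) // 2
    kernel[size // 2, start:start + length] = 1.0
    return kernel / kernel.sum()  # normalise so the blur preserves mean brightness


def psf_to_otf(kernel: np.ndarray, shape: tuple) -> np.ndarray:
    """Zero-pad the kernel to the image shape, move its centre to (0, 0), and FFT it."""
    padded = np.zeros(shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def blur(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Circular convolution of the image with the kernel (assumed degradation model)."""
    otf = psf_to_otf(kernel, image.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))


def wiener_deconvolve(blurred: np.ndarray, kernel: np.ndarray, nsr: float = 1e-3) -> np.ndarray:
    """Classical Wiener deconvolution: F_hat = conj(H) * G / (|H|^2 + nsr)."""
    otf = psf_to_otf(kernel, blurred.shape)
    spectrum = np.fft.fft2(blurred)
    estimate = np.conj(otf) * spectrum / (np.abs(otf) ** 2 + nsr)
    return np.real(np.fft.ifft2(estimate))


def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images with values in [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))


# Synthetic QR-like pattern: 16 x 16 random binary modules, each 8 x 8 pixels.
rng = np.random.default_rng(0)
modules = rng.integers(0, 2, size=(16, 16)).astype(float)
sharp = np.kron(modules, np.ones((8, 8)))

kernel = linear_motion_kernel(length=9, size=9)   # assumed known, not estimated
blurred = blur(sharp, kernel)
restored = wiener_deconvolve(blurred, kernel, nsr=1e-3)

print(f"PSNR blurred : {psnr(sharp, blurred):.2f} dB")
print(f"PSNR restored: {psnr(sharp, restored):.2f} dB")
```

In this noise-free toy setting the Wiener-filtered image should score a higher PSNR than the blurred input. With a real camera image the kernel would have to be estimated first, which is the role the abstract assigns to KEN, and the deconvolved result would serve only as guidance for the restoration network rather than as the final output.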

CLC number: