Journal of Computer Applications, 2024, Vol. 44, Issue (1): 292-299. DOI: 10.11772/j.issn.1001-9081.2023010048

• Multimedia computing and computer simulation •

Lightweight image super-resolution reconstruction network based on Transformer-CNN

Hao CHEN¹, Zhenping XIA¹, Cheng CHENG¹, Xing LIN-LI², Bowen ZHANG¹

  1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  2. School of Physical Science and Technology, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  • Received: 2023-01-17 Revised: 2023-04-12 Accepted: 2023-04-13 Online: 2023-06-06 Published: 2024-01-10
  • Contact: Zhenping XIA
  • About the authors: CHEN Hao, born in 2000, M. S. candidate. His research interests include deep learning and digital image processing.
    CHENG Cheng, born in 1980, Ph. D., lecturer. His research interests include machine learning and artificial intelligence.
    LIN-LI Xing, born in 1997, M. S. candidate. His research interests include machine vision and deep learning.
    ZHANG Bowen, born in 1998, M. S. candidate. His research interests include deep learning and digital image processing.
  • Supported by:
    National Natural Science Foundation of China (62002254); Natural Science Foundation of Jiangsu Province (BK20200988); Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX21_1424)

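Lightweight image super-resolution reconstruction network based on Transformer-CNN (Chinese title: 基于Transformer-CNN的轻量级图像超分辨率重建网络)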

CHEN Hao (陈豪)¹, XIA Zhenping (夏振平)¹, CHENG Cheng (程成)¹, LIN-LI Xing (林李兴)², ZHANG Bowen (张博文)¹

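  1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009
  2. School of Physical Science and Technology, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009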
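  • Corresponding author: XIA Zhenping
  • About the authors: CHEN Hao (born 2000), male, from Guang'an, Sichuan, M. S. candidate. Research interests: deep learning and digital image processing.
    CHENG Cheng (born 1980), male, from Suzhou, Jiangsu, lecturer, Ph. D. Research interests: machine learning and artificial intelligence.
    LIN-LI Xing (born 1997), male, from Sanming, Fujian, M. S. candidate. Research interests: machine vision and deep learning.
    ZHANG Bowen (born 1998), male, from Nantong, Jiangsu, M. S. candidate. Research interests: deep learning and digital image processing.
    First contact: XIA Zhenping (born 1985), male, from Xinghua, Jiangsu, associate professor, Ph. D. Research interests: image quality measurement, evaluation, and optimization.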
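  • Supported by:
    National Natural Science Foundation of China (62002254); Natural Science Foundation of Jiangsu Province (BK20200988); Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX21_1424)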

Abstract:

To address the high computational complexity and large memory consumption of existing super-resolution reconstruction networks, a lightweight image super-resolution reconstruction network based on Transformer-CNN is proposed, making super-resolution reconstruction better suited to embedded terminals such as mobile platforms. First, a hybrid block based on Transformer-CNN is proposed to strengthen the network's ability to capture local-global deep features. Second, a modified inverted residual block that pays special attention to features in high-frequency regions is designed to improve feature extraction and reduce inference time. Finally, after exploring the options for the activation function, the GELU (Gaussian Error Linear Unit) activation function is adopted to further improve network performance. Experimental results show that the proposed network achieves a good balance between image super-resolution performance and network complexity. On the benchmark dataset Urban100 with a scale factor of 4, it reaches an inference speed of 91 frame/s, which is 11 times faster than the strong baseline SwinIR (Image Restoration using Swin Transformer). This indicates that the proposed network can reconstruct image textures and details efficiently while substantially reducing inference time.

Key words: image super-resolution, deep learning, Transformer, Convolutional Neural Network (CNN), lightweight
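
The abstract names GELU as the activation function the network adopts. For readers unfamiliar with it, the following is a minimal sketch of the standard GELU definition, x·Φ(x) with Φ the standard normal CDF, alongside the widely used tanh approximation and ReLU for comparison. It is not taken from the paper's code; the function names `gelu`, `gelu_tanh`, and `relu` are illustrative, and the abstract does not say which GELU variant the authors used.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Common tanh-based approximation of GELU (assumed variant, for comparison)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def relu(x: float) -> float:
    """ReLU, shown only as a reference point."""
    return max(0.0, x)


if __name__ == "__main__":
    # Print the three activations side by side on a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  "
              f"gelu={gelu(x):.4f}  gelu_tanh={gelu_tanh(x):.4f}")
```

Unlike ReLU, GELU is smooth and lets small negative inputs pass through with a small negative output, which is the usual motivation given for preferring it in Transformer-style blocks.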

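Abstract (translated from the Chinese version):

To address the high computational complexity and heavy memory consumption of existing super-resolution reconstruction networks, a lightweight image super-resolution reconstruction network based on Transformer-CNN is proposed, so that super-resolution reconstruction networks are better suited to embedded terminals such as mobile platforms. First, a hybrid module based on Transformer-CNN is proposed to strengthen the network's ability to capture local-global deep features. Second, an improved inverted residual block is proposed that pays special attention to features in high-frequency regions, improving feature extraction and reducing inference time. Finally, after exploring the best choice of activation function, the GELU (Gaussian Error Linear Unit) activation function is adopted to further improve network performance. Experimental results show that the proposed network strikes a good balance between image super-resolution performance and network complexity, and that its inference speed for 4x super-resolution on the benchmark dataset Urban100 reaches 91 frame/s, 11 times faster than the strong network SwinIR (Image Restoration using Swin transformer), indicating that the proposed network can efficiently reconstruct image textures and details while greatly reducing inference time.

Key words: image super-resolution, deep learning, Transformer, convolutional neural network, lightweight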
