Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (5): 1588-1596. DOI: 10.11772/j.issn.1001-9081.2023050636

Special Issue: Multimedia computing and computer simulation

• Multimedia computing and computer simulation •

Image super-resolution network based on global dependency Transformer

Zihan LIU, Dengwen ZHOU, Yukai LIU

  1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China
  • Received: 2023-05-23 Revised: 2023-08-31 Accepted: 2023-09-13 Online: 2023-09-19 Published: 2024-05-10
  • Contact: Zihan LIU
  • About author: ZHOU Dengwen, born in 1965, a native of Huangmei, Hubei, M. S., professor. His research interests include image denoising and image super-resolution.
    LIU Yukai, born in 1996, M. S. candidate. His research interests include deep learning and computer vision.




Most current deep-learning-based image super-resolution networks are built on convolution. Compared with a traditional Convolutional Neural Network (CNN), the main advantage of a Transformer for image super-resolution is its ability to model long-range dependencies. However, most Transformer-based image super-resolution models cannot establish global dependencies with few parameters and few network layers, which limits their performance. To establish global dependencies in a super-resolution network, an image Super-Resolution network based on a Global Dependency Transformer (GDTSR) is proposed. Its main component is the Residual Square Axial Window Block (RSAWB); in the Transformer residual layer, axial windows and self-attention are used so that each pixel depends globally on the entire feature map. In addition, the reconstruction modules of most current image super-resolution models are composed of convolutions. To integrate the extracted feature information dynamically, GDTSR combines a Transformer with convolution to jointly reconstruct the super-resolution image. Experimental results show that GDTSR achieves the best Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) on five standard test sets (Set5, Set14, B100, Urban100 and Manga109) at all three scale factors (×2, ×3 and ×4), and that the improvement is especially pronounced on the large-scale datasets Urban100 and Manga109.
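
For readers unfamiliar with the PSNR metric named in the abstract, the following is a minimal sketch of its standard definition, 10·log10(MAX² / MSE), in plain Python. It is not the authors' evaluation code. The function name `psnr`, the nested-list image layout and the default peak value of 255 are assumptions made for illustration, and this page does not state the evaluation protocol (for example, which colour channel is scored or whether borders are cropped), so those steps are omitted.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images.

    Images are nested lists of rows of pixel intensities (an assumed,
    illustrative layout). max_value is the largest possible intensity.
    """
    # Flatten both images so pixels can be compared pairwise.
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 ground-truth and reconstructed patches.
high_res = [[52, 55], [61, 59]]
super_res = [[54, 55], [60, 58]]
print(f"PSNR: {psnr(high_res, super_res):.2f} dB")
```

A higher PSNR means a smaller mean squared error against the ground-truth image, which is why the abstract reports it alongside SSIM, a metric of structural rather than pixel-wise similarity.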

Key words: image super-resolution, Transformer, self-attention, global dependency, axial window



