Journal of Computer Applications: 217-222. DOI: 10.11772/j.issn.1001-9081.2024040492

• Multimedia computing and computer simulation •

Mixed attention image super-resolution network based on neighborhood attention

Chao SUN¹, Qiang WANG², Dawei YANG¹

  1. College of Information Science and Engineering, Shenyang Ligong University, Shenyang, Liaoning 110159, China
  2. College of Information Engineering, Shenyang University, Shenyang, Liaoning 110044, China
  • Received: 2024-04-23  Revised: 2024-07-08  Accepted: 2024-07-10  Online: 2025-01-24  Published: 2024-12-31
  • Corresponding author: Dawei YANG
  • About the authors: Chao SUN (born 1998), male, from Weihai, Shandong, M.S. candidate. His research interests include image super-resolution and deep learning.
    Qiang WANG (born 1982), male, from Jinan, Shandong, associate professor, Ph.D. candidate. His research interests include image restoration, object recognition, and pedestrian recognition.
    Dawei YANG (born 1976), male, from Suizhong, Liaoning, professor, Ph.D. His research interests include digital image processing and machine learning.

Abstract:

To address the problem that Transformer-based super-resolution networks cannot fully utilize surrounding information, a Mixed Attention Transformer (MAT) image super-resolution network based on neighborhood attention was proposed. Firstly, a convolutional layer was used to extract shallow features, and a series of Residual Mixed Attention Groups (RMAGs) followed by a 3×3 convolutional layer was used for deep feature extraction. In this way, neighborhood attention and channel attention were combined to exploit their complementary advantages, namely the ability to use global statistics and strong local fitting capability at the same time. In addition, an overlapping cross-attention module was introduced to enhance the interaction between features of adjacent windows. Secondly, a global residual connection was added to fuse the shallow and deep features. Finally, the reconstruction module upsampled the fused features by pixel shuffle. Experimental comparisons between MAT and several algorithms, such as RCAN (Residual Channel Attention Network)-it, on multiple datasets show that the Peak Signal-to-Noise Ratio (PSNR) of MAT is 0.3 to 1.0 dB higher than that of advanced methods. These results indicate that MAT effectively improves image restoration quality in image super-resolution tasks.
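The data flow described in the abstract (shallow features, deep features, global residual fusion, pixel-shuffle upsampling) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `sr_forward`, `add_features`, and `pixel_shuffle` are hypothetical, the three callables stand in for the learned layers (whose internals the abstract does not give), and the channel-expanding step before the shuffle is an assumption. The channel ordering in `pixel_shuffle` follows the common sub-pixel convolution convention.

```python
def add_features(a, b):
    """Element-wise sum of two [C][H][W] nested lists (global residual)."""
    return [[[p + q for p, q in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(a, b)]


def pixel_shuffle(x, r):
    """Rearrange [C*r*r][H][W] features into [C][H*r][W*r]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(h):
            for j in range(w):
                for di in range(r):
                    for dj in range(r):
                        # Each group of r*r channels fills one r x r block.
                        src = x[c * r * r + di * r + dj][i][j]
                        out[c][i * r + di][j * r + dj] = src
    return out


def sr_forward(lr, shallow_conv, deep_extractor, expand_conv, scale):
    """Hypothetical forward pass; the callables are placeholders."""
    shallow = shallow_conv(lr)           # one convolutional layer
    deep = deep_extractor(shallow)       # RMAGs followed by a 3x3 convolution
    fused = add_features(shallow, deep)  # global residual connection
    # Assumption: channels are expanded to C*scale*scale before the shuffle.
    return pixel_shuffle(expand_conv(fused), scale)


features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]  # 4 channels, 1x2 each
print(pixel_shuffle(features, 2))  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

With identity functions as stand-ins for all three layers, `sr_forward(features, f, f, f, 2)` simply doubles the input (shallow plus deep) and then rearranges it, which makes the residual and upsampling steps easy to inspect in isolation.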

Key words: Transformer, image super-resolution, neighborhood attention, channel attention, cross-attention
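To put the reported 0.3 to 1.0 dB PSNR gain in perspective, the standard definition PSNR = 10·log10(MAX² / MSE) can be worked through in a few lines. The sketch below is illustrative only: the helper names `psnr` and `mse_ratio_for_gain` are hypothetical, the images are treated as flat pixel sequences with an assumed peak value of 255, and the evaluation protocol used in the paper (color space, border cropping) is not specified in this section.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized flat pixel sequences."""
    if not reference or len(reference) != len(restored):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


for gain in (0.3, 1.0):
    reduction = (1.0 - mse_ratio_for_gain(gain)) * 100.0
    print(f"+{gain} dB PSNR ~ {reduction:.1f}% lower MSE")
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.3 dB corresponds to roughly 6.7% lower MSE and a gain of 1.0 dB to roughly 20.6% lower MSE, at the same peak value.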

CLC Number: