Journal of Computer Applications: sole official website


Underwater image enhancement method based on the SwinGAN generative adversarial network

LI Hui, JIA Bingzhi, WANG Chenxi, DONG Ziyu, LI Jilong, ZHONG Zhaoman, CHEN Yanyan

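  1. School of Computer Engineering, Jiangsu Ocean University, Lianyungang, Jiangsu 222005
  • Received: 2024-06-03 Revised: 2024-07-12 Online: 2024-08-12 Published: 2024-08-12
  • Corresponding author: CHEN Yanyan
  • Supported by:
    National Natural Science Foundation of China (No. 72174079); Sixth Phase "521" Project of Lianyungang City
    (LYG06521202351); Lianyungang Science and Technology Plan Project (CG2325)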

Underwater image enhancement method based on SwinGAN generative adversarial network

LI Hui, JIA Bingzhi, WANG Chenxi, DONG Ziyu, LI Jilong, ZHONG Zhaoman, CHEN Yanyan*

  • Received: 2024-06-03 Revised: 2024-07-12 Online: 2024-08-12 Published: 2024-08-12
  • Supported by:
    This work is partially supported by the National Natural Science
    Foundation of China (72174079); the sixth phase of the
    "521" Project of Lianyungang City (LYG06521202351); and the
    Lianyungang Science and Technology Plan Project (CG2325).

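Abstract (translated from the Chinese): To address the low contrast, heavy noise, and color deviation of underwater images,
SwinGAN (Generative Adversarial Networks Based on Swin Transformer), an underwater image enhancement model based on Swin
Transformer, is proposed with a Generative Adversarial Network (GAN) as its core framework. First, the generator follows an
encoder-bottleneck-decoder design, and the bottleneck layer splits the input feature map into multiple non-overlapping local
windows. Second, a dual-path window multi-head self-attention mechanism is introduced to strengthen local attention while
improving the capture of global information and long-range dependencies. Finally, the decoder recombines the windows into a
feature map of the original size; the discriminator is a Markovian discriminator. Compared with the URSCT-SESR model, the
proposed model improves Peak Signal-to-Noise Ratio (PSNR) by 0.8376 dB and Structural Similarity (SSIM) by 0.0036 on the
UFO-120 dataset. On the EUVP-515 dataset, it improves PSNR by 0.8439 dB, SSIM by 0.0051, the Underwater Image Quality Measure
(UIQM) by 0.1124, and the Underwater Color Image Quality Evaluation (UCIQE) by 0.001. Experimental results show that the
proposed model performs well in both subjective evaluation and objective metrics and is effective at correcting the color
deviation of underwater images.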
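
The window splitting at the bottleneck and the recombination in the decoder can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `window_partition` and `window_reverse` are hypothetical, and the sketch works on a single-channel 2-D grid stored as nested lists rather than on batched feature tensors. It only shows that partitioning into non-overlapping windows is lossless and invertible.

```python
def window_partition(feature_map, window_size):
    """Split an H x W grid (list of rows) into non-overlapping square windows.

    Windows are returned in row-major order. Hypothetical helper for
    illustration only; not taken from the paper.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % window_size or width % window_size:
        raise ValueError("height and width must be multiples of window_size")
    windows = []
    for top in range(0, height, window_size):
        for left in range(0, width, window_size):
            window = [row[left:left + window_size]
                      for row in feature_map[top:top + window_size]]
            windows.append(window)
    return windows


def window_reverse(windows, window_size, height, width):
    """Reassemble row-major windows into a grid of the original H x W size."""
    windows_per_row = width // window_size
    feature_map = [[None] * width for _ in range(height)]
    for index, window in enumerate(windows):
        top = (index // windows_per_row) * window_size
        left = (index % windows_per_row) * window_size
        for dy, row in enumerate(window):
            for dx, value in enumerate(row):
                feature_map[top + dy][left + dx] = value
    return feature_map


if __name__ == "__main__":
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    parts = window_partition(grid, 2)
    print(parts[0])  # [[0, 1], [4, 5]]
    print(window_reverse(parts, 2, 4, 4) == grid)  # True
```

In a windowed attention layer, self-attention would be computed inside each window between these two steps; that part is omitted here.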

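Key words (translated from the Chinese): underwater image enhancement, Swin Transformer, generative adversarial network, multi-head self-attention mechanism, Markovian discriminator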

Abstract: To address the low contrast, heavy noise, and color distortion of underwater images, SwinGAN, an underwater image
enhancement model based on Swin Transformer, is proposed with a Generative Adversarial Network (GAN) as its core framework. First,
the generative network follows an encoder-bottleneck-decoder structure, in which the input feature map is divided into multiple
non-overlapping local windows at the bottleneck layer. Second, a dual-path window multi-head self-attention mechanism is introduced
to enhance local attention while strengthening the capture of global information and long-range dependencies. Finally, the decoder
recombines the windows into a feature map of the original size. The discriminator network is a Markovian discriminator. Compared
with the URSCT-SESR model, the proposed model improves Peak Signal-to-Noise Ratio (PSNR) by 0.8376 dB and Structural Similarity
Index (SSIM) by 0.0036 on the UFO-120 dataset. On the EUVP-515 dataset, it improves PSNR by 0.8439 dB, SSIM by 0.0051, the
Underwater Image Quality Measure (UIQM) by 0.1124, and the Underwater Color Image Quality Evaluation (UCIQE) by 0.001.
Experimental results show that the proposed model performs well in both subjective and objective evaluation and is effective at
correcting color deviation in underwater images.
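
To give the reported PSNR gains some scale, the sketch below uses the standard definition PSNR = 10·log10(MAX² / MSE). It is an illustration only and not the authors' evaluation code: the function names `psnr` and `mse_ratio_for_gain` are hypothetical, pixels are passed as flat lists, and an 8-bit range (MAX = 255) is assumed.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel lists.

    Assumes an 8-bit range by default; hypothetical helper for illustration.
    """
    if not reference or len(reference) != len(enhanced):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE changes when PSNR rises by gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    print(psnr([10, 20], [12, 18]))
    print(mse_ratio_for_gain(0.8376))
```

Because PSNR is logarithmic in the mean squared error, a gain of 0.8376 dB corresponds to an MSE that is roughly 0.82 times the baseline's, under the same peak value.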

Key words: underwater image enhancement, Swin Transformer, Generative Adversarial Network (GAN), multi-head self-attention mechanism, patch discriminator (PatchGAN)
