Journal of Computer Applications, 2026, Vol. 46, Issue 6: 1947-1955. DOI: 10.11772/j.issn.1001-9081.2025060678

• Multimedia computing and computer simulation •

Frequency-domain driven and diffusion-based fusion for sonar image enhancement algorithm

Liwan YAO, Hailong LIU, Zhangfan ZENG

  1. School of Artificial Intelligence, Hubei University, Wuhan Hubei 430062, China
  • Received: 2025-06-19 Revised: 2025-08-19 Accepted: 2025-08-22 Online: 2025-09-05 Published: 2026-06-10
  • Corresponding author: Hailong LIU
  • About the authors: YAO Liwan, born in 2002, M. S. candidate. His research interests include sonar image processing.
    ZENG Zhangfan, born in 1983, Ph. D., associate professor. His research interests include remote sensing image processing, wireless communication, and digital signal processing.
    First contact: LIU Hailong, born in 1989, Ph. D., lecturer. His research interests include sonar image processing and deep learning.
  • Supported by:
    General Program of Natural Science Foundation of Hubei Province (2024AFB933)

基于频域驱动及扩散融合的声纳图像增强算法

姚力挽, 刘海龙, 曾张帆

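  1. School of Artificial Intelligence, Hubei University, Wuhan 430062
  • Corresponding author: LIU Hailong (刘海龙)
  • About the authors: YAO Liwan (姚力挽), born in 2002, male, native of Suizhou, Hubei, M. S. candidate. His research interests include sonar image processing.
    ZENG Zhangfan (曾张帆), born in 1983, male, native of Wuhan, Hubei, associate professor, Ph. D., CCF member. His research interests include remote sensing image processing, wireless communication, and digital signal processing.
    First contact: LIU Hailong (刘海龙), born in 1989, male, native of Suizhou, Hubei, lecturer, Ph. D. His research interests include sonar image processing and deep learning.
  • Supported by:
    General Program of Natural Science Foundation of Hubei Province (2024AFB933)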

Abstract:

To address the low contrast, severe noise interference, and limited resolution of sonar images in complex marine environments, as well as the weak feature extraction of existing algorithms, which are mostly confined to pixel-domain processing, a Frequency-domain driven and Diffusion-based fusion for Sonar Image Enhancement algorithm (FDSIE) was proposed to enhance images by exploiting their frequency-domain features. Specifically, the algorithm comprises three components: a Compact Feature Extraction Network (CFEN), a Frequency-Domain Diffusion Module (FDDM), and a Frequency Recovery Fusion Module (FRFM). Firstly, the CFEN was designed to optimize and compress redundant channel features, effectively suppressing disturbances caused by ocean turbulence and acoustic artifacts. Then, the FDDM was incorporated, in which the diffusion generation submodule was used to train on, infer, and reconstruct the images, while the Selective Attention Feature Enhancement (SAFE) module preserved the integrity of key information, improved inference speed, and reduced computational resource consumption, thereby improving the accuracy of the generated images. Finally, the FRFM was employed to adaptively fuse the low-frequency and diagonal-direction information of the images, strengthening the representation of horizontal and vertical edge details and ultimately yielding clearer target contours and texture details. Experimental results on the public sonar dataset UATD (Underwater Acoustic Target Detection) show that the proposed algorithm achieves the best Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) values, 29.93 dB and 0.898, surpassing the second-best algorithms, Pixel Attention Transform Mechanism (PATM) and FlowIE (Flow-based Image Enhancement framework), by 8% and 5%, respectively. In addition, the proposed algorithm attains the lowest Learned Perceptual Image Patch Similarity (LPIPS), 0.103, which is 34% lower than that of the second-best algorithm, FlowIE. These results demonstrate that the proposed algorithm provides superior image enhancement quality and perceptual consistency in sonar image enhancement tasks.
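
As an illustration of the frequency-domain terms used in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a single-level Haar wavelet (the abstract and keywords name a wavelet transform but not which one) and splits a grayscale image into a low-frequency sub-band plus horizontal, vertical, and diagonal detail sub-bands. It also computes PSNR, the first metric reported above. The function names `haar_dwt2` and `psnr` and the 8-bit peak value of 255 are hypothetical choices made for this example.

```python
import math


def haar_dwt2(img):
    """Single-level 2D Haar transform of a grayscale image with even sides.

    Assumption: Haar is used only for illustration; the paper's wavelet
    is not stated in the abstract.
    Returns (low, horiz, vert, diag), each half the input resolution.
    """
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    low, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        lo_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = img[r][c], img[r][c + 1]          # top pair of the 2x2 block
            p, q = img[r + 1][c], img[r + 1][c + 1]  # bottom pair of the block
            lo_row.append((a + b + p + q) / 2.0)  # scaled block sum: low frequency
            h_row.append((a + b - p - q) / 2.0)   # top minus bottom: horizontal edges
            v_row.append((a - b + p - q) / 2.0)   # left minus right: vertical edges
            d_row.append((a - b - p + q) / 2.0)   # checkerboard pattern: diagonal detail
        low.append(lo_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return low, horiz, vert, diag


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Assumption: 8-bit intensity range, so peak defaults to 255.
    """
    count = len(reference) * len(reference[0])
    squared_error = sum(
        (x - y) ** 2
        for ref_row, test_row in zip(reference, test)
        for x, y in zip(ref_row, test_row)
    )
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Usage: a piecewise-constant image has energy only in the low-frequency band.
image = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
low, horiz, vert, diag = haar_dwt2(image)
```

In this decomposition the horizontal and vertical sub-bands respond to edges along those directions, while the diagonal sub-band captures checkerboard-like detail, which is the distinction the FRFM description relies on.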

Key words: sonar image enhancement, wavelet transform, diffusion model, attention mechanism, self-adaption

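Abstract:

Sonar images acquired in complex marine environments suffer from low contrast, severe noise interference, and limited resolution, and existing algorithms are mostly confined to pixel-space processing, which leaves their feature extraction inadequate. To address these problems, a Frequency-domain driven and Diffusion-based fusion for Sonar Image Enhancement algorithm (FDSIE) is proposed, which enhances images by exploiting their frequency-domain features. Specifically, the algorithm consists of three main parts: a Compact Feature Extraction Network (CFEN), a Frequency-Domain Diffusion Module (FDDM), and a Frequency Recovery Fusion Module (FRFM). Firstly, the CFEN is designed to optimize and compress redundant channel features, effectively suppressing the interference caused by ocean turbulence and acoustic artifacts. Secondly, the FDDM is incorporated, in which the diffusion generation submodule trains on, infers, and reconstructs the images, while the Selective Attention Feature Enhancement (SAFE) module preserves the integrity of key information, improves inference speed, and reduces computational resource consumption, improving the accuracy of the generated images. Finally, the FRFM adaptively fuses the low-frequency and diagonal-direction information of the images, strengthening the representation of horizontal and vertical edge details and ultimately yielding clearer target contours and texture details. Experimental results on the public sonar dataset UATD (Underwater Acoustic Target Detection) show that the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) of the proposed algorithm reach the best values of 29.93 dB and 0.898, improvements of 8% and 5% over the second-best algorithms, Pixel Attention Transform Mechanism (PATM) and FlowIE (Flow-based Image Enhancement framework), respectively, while its Learned Perceptual Image Patch Similarity (LPIPS) reaches the lowest value of 0.103, 34% lower than that of the second-best algorithm, FlowIE. These results show that the proposed algorithm delivers better image enhancement quality and perceptual consistency in sonar image enhancement tasks.

Key words: sonar image enhancement, wavelet transform, diffusion model, attention mechanism, self-adaptation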

CLC Number: