Frequency-domain driven and diffusion fused sonar image enhancement algorithm

doi:10.11772/j.issn.1001-9081.2025060678

Abstract

Abstract: Sonar images acquired in complex marine environments suffer from low contrast, severe noise interference, and limited resolution, and existing algorithms operate mainly in the pixel domain, which limits their feature extraction. To address these problems, a frequency-domain driven and diffusion fused sonar image enhancement algorithm was proposed, in which the image is enhanced by exploiting its frequency-domain features. Specifically, the algorithm comprises three components: a compact feature extraction network, a frequency-domain diffusion module, and a frequency recovery fusion module. First, the compact feature extraction network was designed to optimize and compress channel-redundant features, effectively suppressing disturbances caused by marine turbulence and acoustic artifacts. Then, the frequency-domain diffusion module was incorporated, in which a diffusion generation submodule trains on, infers, and reconstructs the images, while a selective attention feature enhancement submodule preserves the integrity of key information, speeds up inference, and reduces computational resource consumption, thereby improving the accuracy of the generated images. Finally, the frequency recovery fusion module adaptively fuses the low-frequency and diagonal-direction information of the images and strengthens the representation of horizontal and vertical edge details, yielding clearer target contours and texture details. Experimental results on public sonar datasets show that the proposed algorithm achieves the best Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) values, 29.93 dB and 0.898, respectively, surpassing the second-best algorithms PATM (Pixel Attention Transform Mechanism) and FlowIE (Flow-based Image Enhancement framework) by 8% and 5%. In addition, the Learned Perceptual Image Patch Similarity (LPIPS) reaches the lowest value, 0.103, a 33% reduction compared with the second-best algorithm, FlowIE. These results demonstrate that the proposed algorithm provides superior enhancement quality and perceptual consistency in sonar image enhancement tasks.
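The abstract refers to low-frequency, horizontal, vertical, and diagonal image information and reports quality as PSNR. As a rough illustration only, the sketch below splits a grayscale image into four such subbands with a single-level 2D Haar wavelet transform, inverts the transform, and computes PSNR. The Haar wavelet, the function names `haar_dwt2`, `haar_idwt2`, and `psnr`, and the nested-list image format are assumptions made for this example; the abstract does not state which wavelet or implementation the authors used, and subband naming (LH versus HL) varies between libraries.

```python
import math

def haar_dwt2(img):
    """Single-level 2D Haar transform of a grayscale image given as nested
    lists with even height and width. Returns (ll, lh, hl, hh)."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for k in range(0, len(img[0]), 2):
            a, b = img[r][k], img[r][k + 1]          # top pair of the 2x2 block
            c, d = img[r + 1][k], img[r + 1][k + 1]  # bottom pair
            rows[0].append((a + b + c + d) / 2)  # low-frequency approximation
            rows[1].append((a + b - c - d) / 2)  # top minus bottom: horizontal edges
            rows[2].append((a - b + c - d) / 2)  # left minus right: vertical edges
            rows[3].append((a - b - c + d) / 2)  # diagonal detail
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the image from its four subbands."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for k in range(len(ll[0])):
            s, h, v, g = ll[r][k], lh[r][k], hl[r][k], hh[r][k]
            top += [(s + h + v + g) / 2, (s + h - v - g) / 2]
            bottom += [(s - h + v - g) / 2, (s - h - v + g) / 2]
        out += [top, bottom]
    return out

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(x - y) ** 2 for rr, tr in zip(ref, test) for x, y in zip(rr, tr)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Hypothetical 4x4 test image, used only to exercise the functions.
image = [[10, 40, 200, 180],
         [20, 30, 190, 210],
         [15, 25, 205, 195],
         [35, 45, 185, 215]]
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
print(psnr(image, restored))
```

Because the Haar transform is invertible, an enhancement method can modify individual subbands (for example, denoise the detail bands) and still return to the pixel domain without loss from the transform itself. A fusion step like the one the abstract describes would operate on these subbands before the inverse transform.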

Key words: sonar image enhancement, wavelet transform, diffusion model, attention mechanism, self-adaption

摘要： 针对复杂海洋环境中声纳图像存在的对比度低、噪声干扰严重以及分辨率受限等问题，且现有算法多局限于像素空间处理，导致在特征提取方面存在不足，提出一种基于频域驱动及扩散融合的声纳图像增强算法，利用图像的频域特征对图像进行增强。具体来说，算法主要包含三个部分：紧凑特征提取网络，频域扩散模块和频率恢复融合模块。首先，设计紧凑特征提取网络对通道冗余特征进行优化压缩，有效压制了海洋湍流与声学伪影等所带来的干扰。其次，结合频域扩散模块，其中扩散生成子模块对图像进行训练，推理与重建；选择性注意力特征增强子模块在保持关键信息完整性的同时提升推理速度并降低计算资源消耗，提升生成图像的精确度。最后，频率恢复融合模块通过自适应融合图像低频与对角线方向信息，强化水平及垂直边缘细节表征能力，最终实现更为清晰的目标轮廓及细节纹理。在公开声纳数据集上实验表明，所提算法的峰值信噪比(PSNR)和结构相似性度(SSIM)分别达到了最优值29.93dB和0.898，相比次优算法PATM（Pixel Attention Transform Mechanism）和FlowIE（Flow-based Image Enhancement framework）提升了8%和5%，学习感知图像块相似度（LPIPS）也达到最低值0.103，相比次优算法FlowIE降低了33%。上述结果表明，所提算法在声纳图像增强任务中具有更优的图像增强质量与感知一致性。

关键词: 声纳图像增强, 小波变换, 扩散模型, 注意力机制, 自适应

CLC Number: TP391.41

姚力挽, 刘海龙, 曾张帆. 基于频域驱动及扩散融合的声纳图像增强算法[J]. 计算机应用, DOI: 10.11772/j.issn.1001-9081.2025060678.

