CSAF-YOLO: improved YOLO11 algorithm for underwater small object detection

Journal of Computer Applications, 2026, Vol. 46, Issue (5): 1578-1585. DOI: 10.11772/j.issn.1001-9081.2025101310

Multimedia computing and computer simulation

Hongrui ZHANG¹,², Weiming FENG¹,², Luxia YANG¹,², Yongjie MA³

^1. College of Computer Science and Technology, Taiyuan Normal University, Jinzhong Shanxi 030619, China
^2. Shanxi Provincial Key Laboratory of Intelligent Optimization Computing and Blockchain Technology (Taiyuan Normal University), Jinzhong Shanxi 030619, China
^3. College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou Gansu 730070, China

Received: 2025-11-10 Revised: 2025-12-25 Accepted: 2026-01-04 Online: 2026-01-08 Published: 2026-05-10
Contact: Luxia YANG
About author: ZHANG Hongrui, born in 1992, a native of Xiaoyi, Shanxi, Ph.D., lecturer. Her research interests include machine vision tasks in intelligent transportation systems.
FENG Weiming, born in 2000, a native of Yuncheng, Shanxi, M.S. candidate. His research interests include machine vision.
MA Yongjie, born in 1967, a native of Lingtai, Gansu, Ph.D., professor. His research interests include computer measurement and control, and evolutionary algorithms.
Supported by:
National Natural Science Foundation of China (62066041); Key Research and Development Program of Shanxi Province (202102010101008); Science and Technology Innovation Project of Shanxi Higher Education Institutions (2024L295); Key Project of Shanxi Science and Technology Strategy Research Special Program (202304031401011); Shanxi Basic Research Program (Free Exploration Category) (202403021222276); Graduate Education Innovation Project of Taiyuan Normal University in 2025 (SYYJSYC-2597)

Abstract:

To address the challenges of underwater small object detection, such as light scattering, low contrast, and complex backgrounds, an underwater small object detection algorithm named CSAF-YOLO (Cross-Scale Adaptive Fusion YOLO) was proposed based on YOLO11. Firstly, a Multi-Scale Collaborative Fusion (MSCF) module was designed to enhance cross-scale feature synergy and contextual information extraction through spatial fusion and channel interaction mechanisms. Secondly, a Dynamic Kernel Scale Modulation (DKSM) module was constructed to adaptively generate local and global modulation matrices, optimizing convolutional kernels for better adaptability to complex underwater environments. Thirdly, a Multi-Scale Enhanced detection Head (MSE-Head) was proposed to improve small-object localization accuracy via scale-aware enhancement and dynamic cross-scale feature fusion. Finally, the MPDIoU (Modified Penalized Distance Intersection over Union) loss function was introduced to optimize bounding box regression for underwater small objects through minimum point distance and multi-scale penalty mechanisms. Experimental results on the URPC2020 dataset show that CSAF-YOLO achieves an mAP₅₀ (mean Average Precision at an Intersection over Union (IoU) threshold of 50%) of 85.0%, an improvement of 1.6 percentage points over YOLO11. The proposed algorithm provides an effective solution for visual tasks in fields such as marine resource exploration and underwater robotic navigation.

Key words: underwater small object detection, YOLO11, multi-scale feature fusion, dynamic kernel modulation, attention mechanism
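
The bounding box regression term named in the abstract can be illustrated with a minimal sketch. It assumes the base MPDIoU formulation of reference [12] as it is commonly stated: IoU minus the squared distances between matching top-left and bottom-right corners, each normalized by the squared diagonal of the input image. The multi-scale penalty mentioned in the abstract is not specified on this page and is not reproduced here; the function name `mpdiou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative assumptions.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Sketch of an MPDIoU-style loss for one box pair.

    Boxes are (x1, y1, x2, y2) in pixels with x1 < x2 and y1 < y2.
    img_w and img_h are the input image size used for normalization.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal, then convert to a loss.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    # Identical boxes give zero loss; disjoint boxes give a loss above 1.
    print(mpdiou_loss((10, 10, 20, 20), (10, 10, 20, 20), 640, 640))
    print(mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 640, 640))
```

Unlike plain IoU, the corner-distance terms keep the loss informative when the predicted and ground-truth boxes do not overlap at all, which is a frequent situation for very small objects.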

CLC Number: TP391.41

Hongrui ZHANG, Weiming FENG, Luxia YANG, Yongjie MA. CSAF-YOLO: improved YOLO11 algorithm for underwater small object detection[J]. Journal of Computer Applications, 2026, 46(5): 1578-1585.

References

[1] JOSHI R, USMANI K, KRISHNAN G, et al. Underwater object detection and temporal signal detection in turbid water using 3D-integral imaging and deep learning[J]. Optics Express, 2024, 32(2): 1789-1801.
[2] CHOI J Y, HAN J M. Deep learning (Fast R-CNN)-based evaluation of rail surface defects[J]. Applied Sciences, 2024, 14(5): No.1874.
[3] XU X, ZHAO M, SHI P, et al. Crack detection and comparison study based on Faster R-CNN and Mask R-CNN[J]. Sensors, 2022, 22(3): No.1215.
[4] ZHAI S, SHANG D, WANG S, et al. DF-SSD: an improved SSD object detection algorithm based on DenseNet and feature fusion[J]. IEEE Access, 2020, 8: 24344-24357.
[5] CHEN L, ZHOU Y, XU S. ERetinaNet: an efficient neural network based on RetinaNet for mammographic breast mass detection[J]. IEEE Journal of Biomedical and Health Informatics, 2024, 28(5): 2866-2878.
[6] LIU K, PENG L, TANG S. Underwater object detection using TC-YOLO with attention mechanisms[J]. Sensors, 2023, 23(5): No.2567.
[7] SUN Y, ZHENG W, DU X, et al. Underwater small target detection based on YOLOX combined with MobileViT and double coordinate attention[J]. Journal of Marine Science and Engineering, 2023, 11(6): No.1178.
[8] GE H, DAI Y, ZHU Z, et al. Single-stage underwater target detection based on feature anchor frame double optimization network[J]. Sensors, 2022, 22(20): No.7875.
[9] HUA X, CUI X, XU X, et al. Underwater object detection algorithm based on feature enhancement and progressive dynamic aggregation strategy[J]. Pattern Recognition, 2023, 139: No.109511.
[10] MI Y, CHI M, ZHANG Q, et al. Research on multi-scale fusion image enhancement and improved YOLOv5s lightweight ROV underwater target detection method[J]. Scientific Reports, 2024, 14: No.28280.
[11] CAI S, ZHANG X, MO Y. A lightweight underwater detector enhanced by attention mechanism, GSConv and WIoU on YOLOv8[J]. Scientific Reports, 2024, 14: No.25797.
[12] MA S, XU Y. MPDIoU: a loss for efficient and accurate bounding box regression[EB/OL]. [2025-07-16].
[13] HUANG J, WANG K, HOU Y, et al. LW-YOLO11: a lightweight arbitrary-oriented ship detection method based on improved YOLO11[J]. Sensors, 2025, 25(1): No.65.
[14] HIDAYATULLAH P, SYAKRANI N, SHOLAHUDDIN M R, et al. YOLOv8 to YOLO11: a comprehensive architecture in-depth comparative review[EB/OL]. [2025-09-07].
[15] LI T, GANG Y, LI S, et al. A small underwater object detection model with enhanced feature extraction and fusion[J]. Scientific Reports, 2025, 15: No.2396.
[16] ZHOU K, JIANG S. Forest fire detection algorithm based on improved YOLOv11n[J]. Sensors, 2025, 25(10): No.2989.
[17] WU H, NI N, ZHANG L. Learning dynamic scale awareness and global implicit functions for continuous-scale super-resolution of remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: No.5602315.
[18] LIU X B, YANG X Z, CHEN Y, et al. Object detection method based on CIoU improved bounding box loss function[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(5): 656-665.
[19] DU S, ZHANG B, ZHANG P, et al. An improved bounding box regression loss function based on CIoU loss for multi-scale object detection[C]// Proceedings of the IEEE 2nd International Conference on Pattern Recognition and Machine Learning. Piscataway: IEEE, 2021: 92-98.
[20] HU Z, CHENG L, YU S, et al. Underwater target detection with high accuracy and speed based on YOLOv10[J]. Journal of Marine Science and Engineering, 2025, 13(1): No.135.
[21] LIANG X M, ZHAO J Y, YU H F. Lightweight underwater target detection algorithm based on YOLOv8[J]. Infrared Technology, 2024, 46(9): 1015-1024.
[22] FANG Z B, GAO X Y, ZHANG Q S, et al. Underwater object detection model based on improved YOLO11[J]. Electronic Measurement Technology, 2025, 48(15): 159-167.
[23] ANITHA M, SELVY P T. A multi-context squeeze-excitation framework with explainable attention for cervical spine fracture detection in CT imaging[J/OL]. Iranian Journal of Science and Technology, Transactions of Electrical Engineering, 2025 [2025-11-12].
[24] JIANG P, ZHANG J, CHEN J. Enhanced rain removal network with Convolutional Block Attention Module (CBAM): a novel approach to image de-raining[J]. EURASIP Journal on Advances in Signal Processing, 2025, 2025: No.9.
[25] OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]// Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5.
[26] YANG L, GU Y, FENG H. Multi-scale feature fusion and feature calibration with edge information enhancement for remote sensing object detection[J]. Scientific Reports, 2025, 15: No.15371.
[27] WANG Y, ZHANG J, ZHOU J. Urban traffic tiny object detection via attention and multi-scale feature driven in UAV-vision[J]. Scientific Reports, 2024, 14: No.20614.
[28] WU Y, GENG L, GUO X, et al. An improved YOLOv11n model based on wavelet convolution for object detection in soccer scenes[J]. Symmetry, 2025, 17(10): No.1612.
[29] WU T, XU W, WU Y. A lightweight high-frequency mamba network for image super-resolution[J]. Scientific Reports, 2025, 15: No.25973.
[30] DOHERTY J, GARDINER B, KERR E, et al. BiFPN-YOLO: one-stage object detection integrating bi-directional feature pyramid networks[J]. Pattern Recognition, 2025, 160: No.111209.
[31] MOHAMMED A, IBRAHIM H M, OMAR N M. Optimizing RetinaNet anchors using differential evolution for improved object detection[J]. Scientific Reports, 2025, 15: No.20101.
[32] CHEN J, ER M J. Dynamic YOLO for small underwater object detection[J]. Artificial Intelligence Review, 2024, 57(7): No.165.
[33] WANG H, ZHANG Y, ZHU C. DAFPN-YOLO: an improved UAV-based object detection algorithm based on YOLOv8s[J]. Computers, Materials and Continua, 2025, 83(2): 1929-1949.
[34] DAI X, CHEN Y, XIAO B, et al. Dynamic head: unifying object detection heads with attentions[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7373-7382.
[35] SRIVASTAVA S, DIVEKAR A V, ANILKUMAR C, et al. Comparative analysis of deep learning image detection algorithms[J]. Journal of Big Data, 2021, 8: No.66.

Parameter	Setting	Parameter	Setting
Batch size	32	Workers	4
Image size	640×640	Optimizer	SGD
Learning rate	0.01	Epochs	250
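
The training settings above can be written down as a small configuration sketch. The values are copied from the table; the key names (`batch`, `imgsz`, `lr0`, and so on) follow Ultralytics-style training arguments, which is an assumption because this page does not name the training framework. Treating the listed learning rate as the initial learning rate is likewise an assumption, and `format_cli_args` is a hypothetical helper.

```python
# Hyperparameters copied from the training-settings table.
# Key names mimic Ultralytics-style arguments (assumption: the source
# does not state which training framework or entry point was used).
TRAIN_CONFIG = {
    "batch": 32,         # Batch size
    "imgsz": 640,        # Image size 640x640
    "lr0": 0.01,         # Learning rate (assumed to be the initial rate)
    "workers": 4,        # Data-loading workers
    "optimizer": "SGD",  # Optimizer
    "epochs": 250,       # Training epochs
}


def format_cli_args(config):
    """Render the settings as key=value pairs in alphabetical key order."""
    return " ".join(f"{key}={value}" for key, value in sorted(config.items()))


if __name__ == "__main__":
    print(format_cli_args(TRAIN_CONFIG))
```

Keeping the settings in one dictionary makes it easy to reuse the same values across baseline and ablation runs.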

MSCF	C3k2-DKSM	MSE-Head	MPDIoU	mAP₅₀/%	mAP₅₀:₉₅/%	Params/10⁶	GFLOPs	FPS
				83.4	49.1	2.5	6.4	157
√				84.3	50.6	2.9	6.7	150
	√			83.8	49.2	2.6	6.3	155
		√		84.9	52.0	2.8	6.6	152
			√	83.5	49.5	2.3	6.2	160
√	√			84.8	51.9	3.2	7.0	138
√	√	√		85.3	52.7	3.6	7.4	125
√	√	√	√	85.0	53.2	3.2	7.6	135
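
The per-configuration gains implied by the ablation table can be checked with a short calculation. The sketch below copies the mAP₅₀ and mAP₅₀:₉₅ columns from the table and subtracts the baseline row (no modules enabled); the names `ABLATION` and `gains_over_baseline` are illustrative.

```python
# (enabled modules, mAP50 in %, mAP50:95 in %), copied from the ablation table.
ABLATION = [
    ((), 83.4, 49.1),  # baseline YOLO11, no modules enabled
    (("MSCF",), 84.3, 50.6),
    (("C3k2-DKSM",), 83.8, 49.2),
    (("MSE-Head",), 84.9, 52.0),
    (("MPDIoU",), 83.5, 49.5),
    (("MSCF", "C3k2-DKSM"), 84.8, 51.9),
    (("MSCF", "C3k2-DKSM", "MSE-Head"), 85.3, 52.7),
    (("MSCF", "C3k2-DKSM", "MSE-Head", "MPDIoU"), 85.0, 53.2),
]


def gains_over_baseline(rows):
    """Return {configuration: (mAP50 gain, mAP50:95 gain)} in percentage points."""
    _, base_50, base_5095 = rows[0]
    gains = {}
    for modules, map_50, map_5095 in rows[1:]:
        name = "+".join(modules)
        gains[name] = (round(map_50 - base_50, 1), round(map_5095 - base_5095, 1))
    return gains


if __name__ == "__main__":
    for name, (gain_50, gain_5095) in gains_over_baseline(ABLATION).items():
        print(f"{name}: +{gain_50} mAP50, +{gain_5095} mAP50:95")
```

For the full configuration this gives +1.6 points in mAP₅₀, matching the abstract, and +4.1 points in mAP₅₀:₉₅. The three-module configuration without MPDIoU scores a higher mAP₅₀ (85.3%) but a lower mAP₅₀:₉₅ (52.7%) than the full model.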

Algorithm	mAP₅₀/%	Params/10⁶	GFLOPs	FPS
YOLOv5	82.9	2.1	5.8	140
YOLOv6	82.0	4.1	11.5	96
YOLOv8	83.1	2.6	6.9	145
YOLOv10	82.2	2.7	8.4	130
YOLO11	83.4	2.5	6.4	157
YOLOv3-tiny	81.0	1.8	4.5	150
Faster R-CNN	82.0	12.0	18.0	50
SSD	81.5	4.0	7.0	120
Algorithm of Ref. [21]	83.1	1.7	6.9	114
Algorithm of Ref. [22]	84.1	2.9	8.7	—
CSAF-YOLO	85.0	3.2	7.6	135

Attention mechanism	mAP₅₀/%	Params/10⁶	GFLOPs	FPS
CCFM	82.6	1.8	5.4	142
SE	83.5	3.1	7.5	136
EMA	83.2	2.5	6.4	138
CBAM	84.2	3.3	7.8	133
MSCF	84.3	2.9	6.7	150

C3k2 module	mAP₅₀/%	Params/10⁶	GFLOPs	FPS
C3k2-Standard	82.0	2.3	6.0	158.0
C3k2-WTConv	83.4	2.4	6.2	156.0
C3k2-MAB	81.8	2.4	6.5	154.0
C3k2-BiFPN	82.5	2.5	6.4	155.5
C3k2-DKSM	83.8	2.6	6.3	155.0

Detection head	mAP₅₀/%	Params/10⁶	GFLOPs	FPS
RetinaNet-Head	83.0	3.2	7.1	140
CenterNet-Head	83.2	3.0	6.8	145
AFPN-Head	84.2	2.9	6.9	148
Dynamic-Head	84.5	2.7	6.7	150
MSE-Head	84.9	2.8	6.6	152

Experiment No.	MSCF spatial convolution kernel combination	DKSM global modulation dimension	MSE-Head dimension reduction ratio	mAP₅₀/%	Params/10⁶	GFLOPs
1	{3×3}	3	1/4	84.0	2.72	6.6
2	{3×3, 5×5}	3	1/4	84.2	2.81	6.8
3	{3×3, 5×5, 7×7}	3	1/4	84.3	2.90	6.9
4	{3×3, 5×5, 7×7, 9×9}	3	1/4	84.1	3.42	7.3
5	{3×3, 5×5, 7×7}	6	1/4	84.0	3.01	7.1
6	{3×3, 5×5, 7×7}	3	1/8	84.1	2.65	6.5
7	{3×3, 5×5, 7×7}	3	1/2	84.3	3.28	7.5

Algorithm	Small objects	Medium objects	Large objects
YOLO11	75.4	89.1	94.2
CSAF-YOLO	79.6	91.4	94.7
