Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (5): 1578-1585.DOI: 10.11772/j.issn.1001-9081.2025101310
• Multimedia computing and computer simulation •
Hongrui ZHANG1,2, Weiming FENG1,2, Luxia YANG1,2, Yongjie MA3
Received:2025-11-10
Revised:2025-12-25
Accepted:2026-01-04
Online:2026-01-08
Published:2026-05-10
Contact:
Luxia YANG
About author: ZHANG Hongrui, born in 1992, a native of Xiaoyi, Shanxi, Ph. D., lecturer. Her research interests include machine vision tasks in intelligent transportation systems.
Hongrui ZHANG, Weiming FENG, Luxia YANG, Yongjie MA. CSAF-YOLO: improved YOLO11 algorithm for underwater small object detection[J]. Journal of Computer Applications, 2026, 46(5): 1578-1585.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2025101310
| Parameter | Setting | Parameter | Setting |
|---|---|---|---|
| Batch_Size | 32 | workers | 4 |
| Image_Size | 640×640 | optimizer | SGD |
| Learning rate | 0.01 | Epochs | 250 |
Tab. 1 Training parameter settings
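The settings in Table 1 can be captured as a small configuration object. The sketch below is only illustrative: the page does not state which training framework was used, so the mapping onto Ultralytics-style `train()` argument names (`batch`, `imgsz`, `workers`, `optimizer`, `lr0`, `epochs`) is an assumption, as is reading "Learning rate" as the initial learning rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings copied from Table 1."""
    batch_size: int = 32
    image_size: int = 640        # square input, 640x640
    workers: int = 4
    optimizer: str = "SGD"
    learning_rate: float = 0.01  # assumed to be the initial learning rate
    epochs: int = 250

    def to_ultralytics_kwargs(self) -> dict:
        """Map the fields to Ultralytics-style argument names (assumed framework)."""
        return {
            "batch": self.batch_size,
            "imgsz": self.image_size,
            "workers": self.workers,
            "optimizer": self.optimizer,
            "lr0": self.learning_rate,
            "epochs": self.epochs,
        }


cfg = TrainConfig()
train_kwargs = cfg.to_ultralytics_kwargs()
print(train_kwargs)
```

If Ultralytics were the framework, these keyword arguments could be passed to a model's `train()` call together with a dataset file; the dataset path and model definition are not given on this page and would have to be supplied by the reader.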
| MSCF | C3k2-DKSM | MSE-Head | MPDIoU | mAP50/% | mAP50:95/% | Params/10^6 | GFLOPs | FPS |
|---|---|---|---|---|---|---|---|---|
|  |  |  |  | 83.4 | 49.1 | 2.5 | 6.4 | 157 |
| √ |  |  |  | 84.3 | 50.6 | 2.9 | 6.7 | 150 |
|  | √ |  |  | 83.8 | 49.2 | 2.6 | 6.3 | 155 |
|  |  | √ |  | 84.9 | 52.0 | 2.8 | 6.6 | 152 |
|  |  |  | √ | 83.5 | 49.5 | 2.3 | 6.2 | 160 |
| √ | √ |  |  | 84.8 | 51.9 | 3.2 | 7.0 | 138 |
| √ | √ | √ |  | 85.3 | 52.7 | 3.6 | 7.4 | 125 |
| √ | √ | √ | √ | 85.0 | 53.2 | 3.2 | 7.6 | 135 |
Tab. 2 Ablation experiment results of CSAF-YOLO model
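Table 2 lists MPDIoU as one of the ablated components, but this page does not restate how the loss is computed. As a rough sketch, the function below follows the general formulation described in reference [12]: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. Whether CSAF-YOLO uses exactly this form, and the `(x1, y1, x2, y2)` box layout, are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """MPDIoU: IoU penalized by normalized corner-point distances."""
    # Squared distance between the top-left corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the bottom-right corners.
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    # Squared diagonal of the input image, used as the normalizer.
    norm = img_w ** 2 + img_h ** 2
    return iou(pred, target) - d1_sq / norm - d2_sq / norm


def mpdiou_loss(pred, target, img_w, img_h):
    """Loss form: zero for a perfect match, larger as the boxes diverge."""
    return 1.0 - mpdiou(pred, target, img_w, img_h)


perfect = mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 640, 640)
shifted = mpdiou_loss((12, 10, 52, 50), (10, 10, 50, 50), 640, 640)
print(perfect, shifted)
```

Because the penalty acts directly on corner coordinates, a predicted box that overlaps the target but is offset by a few pixels still receives a non-zero loss, which is one plausible reason such a loss is attractive for small objects.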
| Algorithm | mAP50/% | Params/10^6 | GFLOPs | FPS |
|---|---|---|---|---|
| YOLOv5 | 82.9 | 2.1 | 5.8 | 140 |
| YOLOv6 | 82.0 | 4.1 | 11.5 | 96 |
| YOLOv8 | 83.1 | 2.6 | 6.9 | 145 |
| YOLOv10 | 82.2 | 2.7 | 8.4 | 130 |
| YOLO11 | 83.4 | 2.5 | 6.4 | 157 |
| YOLOv3-tiny | 81.0 | 1.8 | 4.5 | 150 |
| Faster R-CNN | 82.0 | 12.0 | 18.0 | 50 |
| SSD | 81.5 | 4.0 | 7.0 | 120 |
| Ref. [ | 83.1 | 1.7 | 6.9 | 114 |
| Ref. [ | 84.1 | 2.9 | 8.7 | - |
| CSAF-YOLO | 85.0 | 3.2 | 7.6 | 135 |
Tab. 3 Performance comparison of CSAF-YOLO with other algorithms
| Attention mechanism | mAP50/% | Params/10^6 | GFLOPs | FPS |
|---|---|---|---|---|
| CCFM | 82.6 | 1.8 | 5.4 | 142 |
| SE | 83.5 | 3.1 | 7.5 | 136 |
| EMA | 83.2 | 2.5 | 6.4 | 138 |
| CBAM | 84.2 | 3.3 | 7.8 | 133 |
| MSCF | 84.3 | 2.9 | 6.7 | 150 |
Tab. 4 Performance comparison of different attention mechanisms
| C3k2 module | mAP50/% | Params/10^6 | GFLOPs | FPS |
|---|---|---|---|---|
| C3k2-Standard | 82.0 | 2.3 | 6.0 | 158.0 |
| C3k2-WTConv | 83.4 | 2.4 | 6.2 | 156.0 |
| C3k2-MAB | 81.8 | 2.4 | 6.5 | 154.0 |
| C3k2-BiFPN | 82.5 | 2.5 | 6.4 | 155.5 |
| C3k2-DKSM | 83.8 | 2.6 | 6.3 | 155.0 |
Tab. 5 Performance comparison of different C3k2 modules
| Detection head | mAP50/% | Params/10^6 | GFLOPs | FPS |
|---|---|---|---|---|
| RetinaNet-Head | 83.0 | 3.2 | 7.1 | 140 |
| CenterNet-Head | 83.2 | 3.0 | 6.8 | 145 |
| AFPN-Head | 84.2 | 2.9 | 6.9 | 148 |
| Dynamic-Head | 84.5 | 2.7 | 6.7 | 150 |
| MSE-Head | 84.9 | 2.8 | 6.6 | 152 |
Tab. 6 Performance comparison of different detection heads
| Experiment No. | MSCF spatial convolution kernel combination | DKSM global modulation dimension | MSE-Head dimensionality reduction ratio | mAP50/% | Params/10^6 | GFLOPs |
|---|---|---|---|---|---|---|
| 1 | {3×3} | 3 | 1/4 | 84.0 | 2.72 | 6.6 |
| 2 | {3×3,5×5} | 3 | 1/4 | 84.2 | 2.81 | 6.8 |
| 3 | {3×3,5×5,7×7} | 3 | 1/4 | 84.3 | 2.90 | 6.9 |
| 4 | {3×3,5×5,7×7,9×9} | 3 | 1/4 | 84.1 | 3.42 | 7.3 |
| 5 | {3×3,5×5,7×7} | 6 | 1/4 | 84.0 | 3.01 | 7.1 |
| 6 | {3×3,5×5,7×7} | 3 | 1/8 | 84.1 | 2.65 | 6.5 |
| 7 | {3×3,5×5,7×7} | 3 | 1/2 | 84.3 | 3.28 | 7.5 |
Tab. 7 Experimental results of key hyperparameter selection
| Algorithm | Small objects | Medium objects | Large objects |
|---|---|---|---|
| YOLO11 | 75.4 | 89.1 | 94.2 |
| CSAF-YOLO | 79.6 | 91.4 | 94.7 |
Tab. 8 mAP50 improvement across different object sizes
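The per-size gains implied by Table 8 follow from simple subtraction. The short sketch below computes them from the tabulated mAP50 values; the dictionary layout and the function name `gains` are illustrative choices, not part of the paper.

```python
# mAP50 (%) by object size, copied from Table 8.
map50 = {
    "YOLO11": {"small": 75.4, "medium": 89.1, "large": 94.2},
    "CSAF-YOLO": {"small": 79.6, "medium": 91.4, "large": 94.7},
}


def gains(baseline, improved):
    """Percentage-point gain per object size, rounded to one decimal place."""
    return {size: round(improved[size] - baseline[size], 1) for size in baseline}


delta = gains(map50["YOLO11"], map50["CSAF-YOLO"])
print(delta)  # {'small': 4.2, 'medium': 2.3, 'large': 0.5}
```

The gain is largest for small objects (4.2 percentage points) and shrinks to 0.5 for large objects, consistent with the model's stated focus on small-object detection.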
References

| No. | Reference |
|---|---|
| [1] | JOSHI R, USMANI K, KRISHNAN G, et al. Underwater object detection and temporal signal detection in turbid water using 3D-integral imaging and deep learning[J]. Optics Express, 2024, 32(2): 1789-1801. |
| [2] | CHOI J Y, HAN J M. Deep learning (Fast R-CNN)-based evaluation of rail surface defects[J]. Applied Sciences, 2024, 14(5): No.1874. |
| [3] | XU X, ZHAO M, SHI P, et al. Crack detection and comparison study based on Faster R-CNN and Mask R-CNN[J]. Sensors, 2022, 22(3): No.1215. |
| [4] | ZHAI S, SHANG D, WANG S, et al. DF-SSD: an improved SSD object detection algorithm based on DenseNet and feature fusion[J]. IEEE Access, 2020, 8: 24344-24357. |
| [5] | CHEN L, ZHOU Y, XU S. ERetinaNet: an efficient neural network based on RetinaNet for mammographic breast mass detection[J]. IEEE Journal of Biomedical and Health Informatics, 2024, 28(5): 2866-2878. |
| [6] | LIU K, PENG L, TANG S. Underwater object detection using TC-YOLO with attention mechanisms[J]. Sensors, 2023, 23(5): No.2567. |
| [7] | SUN Y, ZHENG W, DU X, et al. Underwater small target detection based on YOLOX combined with MobileViT and double coordinate attention[J]. Journal of Marine Science and Engineering, 2023, 11(6): No.1178. |
| [8] | GE H, DAI Y, ZHU Z, et al. Single-stage underwater target detection based on feature anchor frame double optimization network[J]. Sensors, 2022, 22(20): No.7875. |
| [9] | HUA X, CUI X, XU X, et al. Underwater object detection algorithm based on feature enhancement and progressive dynamic aggregation strategy[J]. Pattern Recognition, 2023, 139: No.109511. |
| [10] | MI Y, CHI M, ZHANG Q, et al. Research on multi-scale fusion image enhancement and improved YOLOv5s lightweight ROV underwater target detection method[J]. Scientific Reports, 2024, 14: No.28280. |
| [11] | CAI S, ZHANG X, MO Y. A lightweight underwater detector enhanced by attention mechanism, GSConv and WIoU on YOLOv8[J]. Scientific Reports, 2024, 14: No.25797. |
| [12] | MA S, XU Y. MPDIoU: a loss for efficient and accurate bounding box regression[EB/OL]. [2025-07-16]. |
| [13] | HUANG J, WANG K, HOU Y, et al. LW-YOLO11: a lightweight arbitrary-oriented ship detection method based on improved YOLO11[J]. Sensors, 2025, 25(1): No.65. |
| [14] | HIDAYATULLAH P, SYAKRANI N, SHOLAHUDDIN M R, et al. YOLOv8 to YOLO11: a comprehensive architecture in-depth comparative review[EB/OL]. [2025-09-07]. |
| [15] | LI T, GANG Y, LI S, et al. A small underwater object detection model with enhanced feature extraction and fusion[J]. Scientific Reports, 2025, 15: No.2396. |
| [16] | ZHOU K, JIANG S. Forest fire detection algorithm based on improved YOLOv11n[J]. Sensors, 2025, 25(10): No.2989. |
| [17] | WU H, NI N, ZHANG L. Learning dynamic scale awareness and global implicit functions for continuous-scale super-resolution of remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: No.5602315. |
| [18] | LIU X B, YANG X Z, CHEN Y, et al. Object detection method based on CIoU improved bounding box loss function[J]. Chinese Journal of Liquid Crystals and Displays, 2023, 38(5): 656-665. |
| [19] | DU S, ZHANG B, ZHANG P, et al. An improved bounding box regression loss function based on CIoU loss for multi-scale object detection[C]// Proceedings of the IEEE 2nd International Conference on Pattern Recognition and Machine Learning. Piscataway: IEEE, 2021: 92-98. |
| [20] | HU Z, CHENG L, YU S, et al. Underwater target detection with high accuracy and speed based on YOLOv10[J]. Journal of Marine Science and Engineering, 2025, 13(1): No.135. |
| [21] | LIANG X M, ZHAO J Y, YU H F. Lightweight underwater target detection algorithm based on YOLOv8[J]. Infrared Technology, 2024, 46(9): 1015-1024. |
| [22] | FANG Z B, GAO X Y, ZHANG Q S, et al. Underwater object detection model based on improved YOLO11[J]. Electronic Measurement Technology, 2025, 48(15): 159-167. |
| [23] | ANITHA M, SELVY P T. A multi-context squeeze-excitation framework with explainable attention for cervical spine fracture detection in CT imaging[J/OL]. Iranian Journal of Science and Technology, Transactions of Electrical Engineering, 2025 [2025-11-12]. |
| [24] | JIANG P, ZHANG J, CHEN J. Enhanced rain removal network with Convolutional Block Attention Module (CBAM): a novel approach to image de-raining[J]. EURASIP Journal on Advances in Signal Processing, 2025, 2025: No.9. |
| [25] | OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]// Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5. |
| [26] | YANG L, GU Y, FENG H. Multi-scale feature fusion and feature calibration with edge information enhancement for remote sensing object detection[J]. Scientific Reports, 2025, 15: No.15371. |
| [27] | WANG Y, ZHANG J, ZHOU J. Urban traffic tiny object detection via attention and multi-scale feature driven in UAV-vision[J]. Scientific Reports, 2024, 14: No.20614. |
| [28] | WU Y, GENG L, GUO X, et al. An improved YOLOv11n model based on wavelet convolution for object detection in soccer scenes[J]. Symmetry, 2025, 17(10): No.1612. |
| [29] | WU T, XU W, WU Y. A lightweight high-frequency mamba network for image super-resolution[J]. Scientific Reports, 2025, 15: No.25973. |
| [30] | DOHERTY J, GARDINER B, KERR E, et al. BiFPN-YOLO: one-stage object detection integrating bi-directional feature pyramid networks[J]. Pattern Recognition, 2025, 160: No.111209. |
| [31] | MOHAMMED A, IBRAHIM H M, OMAR N M. Optimizing RetinaNet anchors using differential evolution for improved object detection[J]. Scientific Reports, 2025, 15: No.20101. |
| [32] | CHEN J, ER M J. Dynamic YOLO for small underwater object detection[J]. Artificial Intelligence Review, 2024, 57(7): No.165. |
| [33] | WANG H, ZHANG Y, ZHU C. DAFPN-YOLO: an improved UAV-based object detection algorithm based on YOLOv8s[J]. Computers, Materials and Continua, 2025, 83(2): 1929-1949. |
| [34] | DAI X, CHEN Y, XIAO B, et al. Dynamic head: unifying object detection heads with attentions[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 7373-7382. |
| [35] | SRIVASTAVA S, DIVEKAR A V, ANILKUMAR C, et al. Comparative analysis of deep learning image detection algorithms[J]. Journal of Big Data, 2021, 8: No.66. |