Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (11): 3610-3616.DOI: 10.11772/j.issn.1001-9081.2023111550
• Multimedia computing and computer simulation •
Dahai LI, Bingtao LI, Zhendong WANG
Received: 2023-11-23
Revised: 2024-03-26
Accepted: 2024-04-10
Online: 2024-04-12
Published: 2024-11-10
Contact: Bingtao LI
About author:
LI Dahai, born in 1975 in Rushan, Shandong, Ph. D., associate professor, CCF member. His research interests include deep learning, reinforcement learning, and intelligent optimization algorithms.
Dahai LI, Bingtao LI, Zhendong WANG. Underwater target detection algorithm based on improved YOLOv8[J]. Journal of Computer Applications, 2024, 44(11): 3610-3616.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023111550
Attention mechanism | RUOD mAP0.5/% | RUOD v/(frame·s⁻¹) | UPRC mAP0.5/% | UPRC v/(frame·s⁻¹) | S/MB |
---|---|---|---|---|---|
SE | 69.3 | 51 | 86.3 | 42 | 13.5 |
CBAM | 69.8 | 53 | 86.9 | 43 | 13.9 |
LKCA | 71.1 | 45 | 87.0 | 38 | 20.6 |
CA | 70.8 | 52 | 86.8 | 45 | 14.1 |
FCA | 71.4 | 54 | 87.2 | 46 | 14.4 |
Tab. 1 Comparison of detection results of different attention mechanisms introduced by YOLOv8s
Loss function | RUOD mAP0.5/% | RUOD v/(frame·s⁻¹) | UPRC mAP0.5/% | UPRC v/(frame·s⁻¹) |
---|---|---|---|---|
CIoU | 68.9 | 58 | 85.7 | 52 |
EIoU | 69.2 | 54 | 85.8 | 51 |
GIoU | 70.3 | 53 | 85.8 | 50 |
WIoU v1 | 71.6 | 54 | 86.1 | 52 |
WIoU v2 | 71.9 | 53 | 86.4 | 53 |
WIoU v3 | 72.4 | 53 | 86.7 | 53 |
Tab. 2 Comparison of detection results of different loss functions used by YOLOv8s
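The losses compared in Tab. 2 are all built on the intersection over union (IoU) of a predicted box and a ground-truth box. As a rough illustration only, the sketch below computes IoU and the standard GIoU value for axis-aligned boxes given as `(x1, y1, x2, y2)`. The function names and box format are assumptions made for this example; the WIoU variants and the training code used in the paper are not reproduced here.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes (illustrative helper)."""
    # Overlap rectangle; box_area clamps it to 0 when the boxes are disjoint.
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest box enclosing both inputs, used by the GIoU penalty term.
    enclose = box_area((min(a[0], b[0]), min(a[1], b[1]),
                        max(a[2], b[2]), max(a[3], b[3])))
    giou = iou - (enclose - union) / enclose if enclose > 0 else iou
    return iou, giou


iou, giou = iou_and_giou((0, 0, 2, 2), (1, 1, 3, 3))
giou_loss = 1.0 - giou  # the usual GIoU regression loss
```

For disjoint boxes the IoU is 0 regardless of how far apart they are, while GIoU keeps decreasing as the enclosing box grows, which is why GIoU-style losses still give a useful signal in that case.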
Model | RUOD mAP0.5/% | RUOD v/(frame·s⁻¹) | UPRC mAP0.5/% | UPRC v/(frame·s⁻¹) | S/MB |
---|---|---|---|---|---|
Faster R-CNN | 58.6 | 16 | 64.4 | 15 | 42.0 |
SSD300 | 66.4 | 58 | 71.3 | 49 | 28.0 |
DETR-DC5 | 60.8 | 42 | 69.7 | 31 | 41.0 |
YOLOv5s | 66.8 | 54 | 83.2 | 46 | 7.2 |
YOLOv7 | 67.6 | 55 | 84.6 | 48 | 14.4 |
PANet | 61.4 | 62 | 80.5 | 48 | 14.6 |
RoIMix | 70.2 | 42 | 84.2 | 45 | 17.4 |
LKCA-YOLOv5 | 72.1 | 48 | 87.3 | 33 | 20.6 |
YOLOv8s | 68.9 | 58 | 85.7 | 52 | 12.3 |
WCA-YOLOv8 | 75.8 | 60 | 88.6 | 57 | 15.9 |
Tab. 3 Comparison of experimental results of different models
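To make the comparison in Tab. 3 easier to read, the short sketch below computes how much WCA-YOLOv8 changes mAP0.5 and speed relative to the YOLOv8s baseline on each dataset. The numbers are copied from the table; the dictionary layout and the `gains` helper are hypothetical names used only for this illustration.

```python
# (mAP0.5 in %, speed in frames per second), copied from Tab. 3.
results = {
    "YOLOv8s": {"RUOD": (68.9, 58), "UPRC": (85.7, 52)},
    "WCA-YOLOv8": {"RUOD": (75.8, 60), "UPRC": (88.6, 57)},
}


def gains(baseline, improved):
    """Per-dataset (mAP0.5 gain in percentage points, speed gain in frames/s)."""
    out = {}
    for dataset, (base_map, base_fps) in results[baseline].items():
        new_map, new_fps = results[improved][dataset]
        # Round to one decimal to match the precision reported in the table.
        out[dataset] = (round(new_map - base_map, 1), new_fps - base_fps)
    return out


deltas = gains("YOLOv8s", "WCA-YOLOv8")
print(deltas)  # {'RUOD': (6.9, 2), 'UPRC': (2.9, 5)}
```

On these figures, WCA-YOLOv8 gains 6.9 percentage points of mAP0.5 on RUOD and 2.9 on UPRC over YOLOv8s, while also running 2 and 5 frames per second faster, respectively.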
Structure D | Structure F | Structure L | Structure M | S/MB | T/min | C/GFLOPs | v/(frame·s⁻¹) | mAP0.5/% |
---|---|---|---|---|---|---|---|---|
 |  |  |  | 12.3 | 157 | 16.7 | 58 | 68.9 |
√ |  |  |  | 9.5 | 139 | 13.2 | 67 | 70.1 |
 | √ |  |  | 14.4 | 169 | 18.3 | 54 | 71.4 |
 |  | √ |  | 14.2 | 166 | 18.7 | 53 | 72.4 |
 |  |  | √ | 13.2 | 162 | 17.9 | 56 | 71.6 |
√ | √ |  |  | 11.3 | 151 | 15.2 | 63 | 72.5 |
√ | √ | √ |  | 13.8 | 154 | 15.7 | 62 | 74.7 |
√ | √ | √ | √ | 15.9 | 158 | 16.4 | 60 | 75.8 |
Tab. 4 Results of WCA-YOLOv8 ablation experiments
1 | XU S, ZHANG M, SONG W, et al. A systematic review and analysis of deep learning-based underwater object detection[J]. Neurocomputing, 2023, 527: 204-232. |
2 | GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. |
3 | GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. |
4 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems — Volume 1. Cambridge: MIT Press, 2015:91-99. |
5 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37. |
6 | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. |
7 | TAO Y, ZHAO W B, ZHONG B Q, et al. Underwater target detection algorithm with large kernel convolutional attention mechanism[J/OL]. Journal of Chinese Computer Systems, 2023 [2024-04-10].. |
8 | BAO Z, GUO Y, WANG J, et al. Underwater target detection based on parallel high-resolution networks[J]. Sensors, 2023, 23(17): No.7337. |
9 | CHEN Y L, DONG S J, ZHU S K, et al. Improved YOLOv3 shallow sea underwater biological target detection[J]. Computer Engineering and Applications, 2023, 59(18): 190-197. |
10 | LIU P, YANG H B, SONG Y. Improved YOLOv3 network based marine biometric recognition algorithm[J]. Application Research of Computers, 2020, 37(S1): 394-397. |
11 | SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. |
12 | HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717. |
13 | MA N, ZHANG X, SUN J. Funnel activation for visual recognition[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12356. Cham: Springer, 2020: 351-368. |
14 | ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IoU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146-157. |
15 | LI X, WANG W, WU L, et al. Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21002-21012. |
16 | ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000. |
17 | TONG Z, CHEN Y, XU Z, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism[EB/OL]. (2023-04-08) [2024-04-10]. |
18 | REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 658-666. |
19 | ZENG L, SUN B, ZHU D. Underwater target detection based on Faster R-CNN and adversarial occlusion network[J]. Engineering Applications of Artificial Intelligence, 2021, 100: No.104190. |
20 | CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with Transformers[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12346. Cham: Springer, 2020: 213-229. |
21 | LEI F, TANG F, LI S. Underwater target detection algorithm based on improved YOLOv5[J]. Journal of Marine Science and Engineering, 2022, 10(3): No.310. |
22 | LIN W H, ZHONG J X, LIU S, et al. RoIMix: proposal-fusion among multiple images for underwater object detection[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 2588-2592. |
23 | STERGIOU A, POPPE R, KALLIATAKIS G. Refining activation downsampling with SoftPool[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10337-10346. |
24 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
25 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |