Journal of Computer Applications, 2024, Vol. 44, Issue (11): 3610-3616. DOI: 10.11772/j.issn.1001-9081.2023111550
• Multimedia computing and computer simulation •
Dahai LI, Bingtao LI, Zhendong WANG
Received: 2023-11-23
Revised: 2024-03-26
Accepted: 2024-04-10
Online: 2024-04-12
Published: 2024-11-10
Contact: Bingtao LI
About author: LI Dahai, born in 1975 in Rushan, Shandong, Ph. D., associate professor, CCF member. His research interests include deep learning, reinforcement learning, and intelligent optimization algorithms.
Dahai LI, Bingtao LI, Zhendong WANG. Underwater target detection algorithm based on improved YOLOv8[J]. Journal of Computer Applications, 2024, 44(11): 3610-3616.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023111550
| Attention mechanism | RUOD mAP0.5/% | RUOD v/(frame·s⁻¹) | UPRC mAP0.5/% | UPRC v/(frame·s⁻¹) | S/MB |
|---|---|---|---|---|---|
| SE | 69.3 | 51 | 86.3 | 42 | 13.5 |
| CBAM | 69.8 | 53 | 86.9 | 43 | 13.9 |
| LKCA | 71.1 | 45 | 87.0 | 38 | 20.6 |
| CA | 70.8 | 52 | 86.8 | 45 | 14.1 |
| FCA | 71.4 | 54 | 87.2 | 46 | 14.4 |
Tab. 1 Comparison of detection results of different attention mechanisms introduced into YOLOv8s
| Loss function | RUOD mAP0.5/% | RUOD v/(frame·s⁻¹) | UPRC mAP0.5/% | UPRC v/(frame·s⁻¹) |
|---|---|---|---|---|
| CIoU | 68.9 | 58 | 85.7 | 52 |
| EIoU | 69.2 | 54 | 85.8 | 51 |
| GIoU | 70.3 | 53 | 85.8 | 50 |
| WIoU v1 | 71.6 | 54 | 86.1 | 52 |
| WIoU v2 | 71.9 | 53 | 86.4 | 53 |
| WIoU v3 | 72.4 | 53 | 86.7 | 53 |
Tab. 2 Comparison of detection results of different loss functions used by YOLOv8s
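For readers unfamiliar with the IoU-based regression losses compared in Tab. 2, the following is a minimal sketch of one of them, GIoU, written from its standard published definition. It is not the authors' training code; the function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, and the boxes are assumed to have positive area.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection area; width or height clamps to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Area of the smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Regression loss form: zero for identical boxes, up to 2 for distant ones."""
    return 1.0 - giou(box_a, box_b)


print(giou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Unlike plain IoU, this quantity still varies when the boxes do not overlap, which is why it can be used as a loss for poorly placed predictions.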
| Model | RUOD mAP0.5/% | RUOD v/(frame·s⁻¹) | UPRC mAP0.5/% | UPRC v/(frame·s⁻¹) | S/MB |
|---|---|---|---|---|---|
| Faster R-CNN | 58.6 | 16 | 64.4 | 15 | 42.0 |
| SSD300 | 66.4 | 58 | 71.3 | 49 | 28.0 |
| DETR-DC5 | 60.8 | 42 | 69.7 | 31 | 41.0 |
| YOLOv5s | 66.8 | 54 | 83.2 | 46 | 7.2 |
| YOLOv7 | 67.6 | 55 | 84.6 | 48 | 14.4 |
| PANet | 61.4 | 62 | 80.5 | 48 | 14.6 |
| RoIMix | 70.2 | 42 | 84.2 | 45 | 17.4 |
| LKCA-YOLOv5 | 72.1 | 48 | 87.3 | 33 | 20.6 |
| YOLOv8s | 68.9 | 58 | 85.7 | 52 | 12.3 |
| WCA-YOLOv8 | 75.8 | 60 | 88.6 | 57 | 15.9 |
Tab. 3 Comparison of experimental results of different models
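To make the headline comparison in Tab. 3 concrete, the short sketch below computes the gain of WCA-YOLOv8 over the YOLOv8s baseline directly from the tabulated values. The dictionary layout and the helper name `gain` are illustrative assumptions, not part of the paper.

```python
# Values copied from Tab. 3: (mAP0.5 in %, speed in frame/s) per dataset, plus model size in MB.
RESULTS = {
    "YOLOv8s": {"RUOD": (68.9, 58), "UPRC": (85.7, 52), "size_mb": 12.3},
    "WCA-YOLOv8": {"RUOD": (75.8, 60), "UPRC": (88.6, 57), "size_mb": 15.9},
}


def gain(dataset, base="YOLOv8s", new="WCA-YOLOv8"):
    """Return (mAP0.5 gain in percentage points, speed gain in frame/s)."""
    base_map, base_fps = RESULTS[base][dataset]
    new_map, new_fps = RESULTS[new][dataset]
    # Round to one decimal to match the precision reported in the table.
    return round(new_map - base_map, 1), new_fps - base_fps


for name in ("RUOD", "UPRC"):
    d_map, d_fps = gain(name)
    print(f"{name}: mAP0.5 {d_map:+.1f} points, speed {d_fps:+d} frame/s")
```

On the tabulated numbers this gives a gain of 6.9 points on RUOD and 2.9 points on UPRC, with the model growing from 12.3 MB to 15.9 MB.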
| Structure D | Structure F | Structure L | Structure M | S/MB | T/min | C/GFLOPs | v/(frame·s⁻¹) | mAP0.5/% |
|---|---|---|---|---|---|---|---|---|
|  |  |  |  | 12.3 | 157 | 16.7 | 58 | 68.9 |
| √ |  |  |  | 9.5 | 139 | 13.2 | 67 | 70.1 |
|  | √ |  |  | 14.4 | 169 | 18.3 | 54 | 71.4 |
|  |  | √ |  | 14.2 | 166 | 18.7 | 53 | 72.4 |
|  |  |  | √ | 13.2 | 162 | 17.9 | 56 | 71.6 |
| √ | √ |  |  | 11.3 | 151 | 15.2 | 63 | 72.5 |
| √ | √ | √ |  | 13.8 | 154 | 15.7 | 62 | 74.7 |
| √ | √ | √ | √ | 15.9 | 158 | 16.4 | 60 | 75.8 |
Tab. 4 Results of WCA-YOLOv8 ablation experiments
| 1 | XU S, ZHANG M, SONG W, et al. A systematic review and analysis of deep learning-based underwater object detection[J]. Neurocomputing, 2023, 527: 204-232. |
| 2 | GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. |
| 3 | GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448. |
| 4 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems — Volume 1. Cambridge: MIT Press, 2015:91-99. |
| 5 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37. |
| 6 | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. |
| 7 | TAO Y, ZHAO W B, ZHONG B Q, et al. Underwater target detection algorithm with large kernel convolutional attention mechanism[J/OL]. Journal of Chinese Computer Systems, 2023 [2024-04-10]. (in Chinese) |
| 8 | BAO Z, GUO Y, WANG J, et al. Underwater target detection based on parallel high-resolution networks[J]. Sensors, 2023, 23(17): No.7337. |
| 9 | CHEN Y L, DONG S J, ZHU S K, et al. Improved YOLOv3 shallow sea underwater biological target detection[J]. Computer Engineering and Applications, 2023, 59(18): 190-197. (in Chinese) |
| 10 | LIU P, YANG H B, SONG Y. Improved YOLOv3 network based marine biometric recognition algorithm[J]. Application Research of Computers, 2020, 37(S1): 394-397. (in Chinese) |
| 11 | SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. |
| 12 | HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717. |
| 13 | MA N, ZHANG X, SUN J. Funnel activation for visual recognition[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12356. Cham: Springer, 2020: 351-368. |
| 14 | ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IoU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146-157. |
| 15 | LI X, WANG W, WU L, et al. Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21002-21012. |
| 16 | ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000. |
| 17 | TONG Z, CHEN Y, XU Z, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism[EB/OL]. (2023-04-08) [2024-04-10]. |
| 18 | REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 658-666. |
| 19 | ZENG L, SUN B, ZHU D. Underwater target detection based on Faster R-CNN and adversarial occlusion network[J]. Engineering Applications of Artificial Intelligence, 2021, 100: No.104190. |
| 20 | CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with Transformers[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12346. Cham: Springer, 2020: 213-229. |
| 21 | LEI F, TANG F, LI S. Underwater target detection algorithm based on improved YOLOv5[J]. Journal of Marine Science and Engineering, 2022, 10(3): No.310. |
| 22 | LIN W H, ZHONG J X, LIU S, et al. RoIMix: proposal-fusion among multiple images for underwater object detection[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 2588-2592. |
| 23 | STERGIOU A, POPPE R, KALLIATAKIS G. Refining activation downsampling with SoftPool[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10337-10346. |
| 24 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
| 25 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |