Underwater target detection algorithm based on improved YOLOv8

Journal of Computer Applications, 2024, Vol. 44, Issue 11: 3610-3616. DOI: 10.11772/j.issn.1001-9081.2023111550

• Multimedia computing and computer simulation •

Dahai LI, Bingtao LI, Zhendong WANG

School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou Jiangxi 341000, China

Received: 2023-11-23  Revised: 2024-03-26  Accepted: 2024-04-10  Online: 2024-04-12  Published: 2024-11-10
Contact: Bingtao LI
About the authors: LI Dahai, born in 1975, a native of Rushan, Shandong, Ph. D., associate professor, CCF member. His research interests include deep learning, reinforcement learning and intelligent optimization algorithms.
WANG Zhendong, born in 1982, a native of Suizhou, Hubei, Ph. D., associate professor. His research interests include wireless sensor network node coverage, artificial intelligence and cybersecurity.
Supported by: National Natural Science Foundation of China (620620237); Science Foundation of Jiangxi University of Science and Technology (205200100013)

Abstract:

Owing to the characteristics of underwater creatures, underwater images usually contain many small targets that are hard to detect and that often overlap or occlude one another. In addition, light absorption and scattering in the underwater environment cause color offset and blur in underwater images. To overcome these challenges, an underwater target detection algorithm named WCA-YOLOv8 was proposed. Firstly, a Feature Fusion Module (FFM) was designed to strengthen the focus on spatial-dimension information and thereby improve recognition of targets affected by color offset and blur. Secondly, the FReLU Coordinate Attention (FCA) module was added to enhance feature extraction for overlapped and occluded underwater targets. Thirdly, the Complete Intersection over Union (CIoU) loss function was replaced by the Wise-IoU version 3 (WIoU v3) loss function to strengthen detection of small targets. Finally, a Downsampling Enhancement Module (DEM) was designed to preserve context information more completely during feature extraction and so improve underwater target detection. Experimental results show that, on the RUOD and URPC datasets respectively, WCA-YOLOv8 achieves a mean Average Precision (mAP_0.5) of 75.8% and 88.6% and detection speeds of 60 frame/s and 57 frame/s. Compared with other state-of-the-art underwater target detection algorithms, WCA-YOLOv8 achieves higher detection accuracy at faster detection speed.

Key words: YOLOv8, underwater target detection, feature fusion, Wise-IoU version 3 (WIoU v3) loss function
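
The mAP_0.5 figures quoted in the abstract rest on per-class average precision, where a detection counts as correct when it matches a ground-truth box with IoU of at least 0.5. The following is a minimal sketch of that metric, assuming all-point interpolation of the precision-recall curve. The abstract does not state which interpolation scheme or evaluation code the authors used, and the function and argument names here are illustrative only.

```python
from typing import List, Tuple


def average_precision(detections: List[Tuple[float, bool]], num_gt: int) -> float:
    """All-point interpolated AP for a single class.

    detections: (confidence, is_true_positive) pairs. The true-positive flag is
    assumed to have been set beforehand by matching each detection to an
    unmatched ground-truth box with IoU >= 0.5.
    num_gt: number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the stepwise precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

For example, `average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)` ranks the three detections by confidence and integrates the resulting two-step curve. Evaluation toolkits differ in interpolation details, so absolute values from this sketch need not match the reported ones.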

CLC Number: TP37

Dahai LI, Bingtao LI, Zhendong WANG. Underwater target detection algorithm based on improved YOLOv8[J]. Journal of Computer Applications, 2024, 44(11): 3610-3616.

李大海, 李冰涛, 王振东. 基于改进YOLOv8的水下目标检测算法[J]. 计算机应用, 2024, 44(11): 3610-3616.

References

1	XU S, ZHANG M, SONG W, et al. A systematic review and analysis of deep learning-based underwater object detection[J]. Neurocomputing, 2023, 527: 204-232.
2	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587.
3	GIRSHICK R. Fast R-CNN[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1440-1448.
4	REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1. Cambridge: MIT Press, 2015: 91-99.
5	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
6	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788.
7	TAO Y, ZHAO W B, ZHONG B Q, et al. Underwater target detection algorithm with large kernel convolutional attention mechanism[J/OL]. Journal of Chinese Computer Systems, 2023 [2024-04-10]. (in Chinese)
8	BAO Z, GUO Y, WANG J, et al. Underwater target detection based on parallel high-resolution networks[J]. Sensors, 2023, 23(17): No.7337.
9	CHEN Y L, DONG S J, ZHU S K, et al. Improved YOLOv3 shallow sea underwater biological target detection[J]. Computer Engineering and Applications, 2023, 59(18): 190-197. (in Chinese)
10	LIU P, YANG H B, SONG Y. Improved YOLOv3 network based marine biometric recognition algorithm[J]. Application Research of Computers, 2020, 37(S1): 394-397. (in Chinese)
11	SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520.
12	HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717.
13	MA N, ZHANG X, SUN J. Funnel activation for visual recognition[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12356. Cham: Springer, 2020: 351-368.
14	ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IoU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146-157.
15	LI X, WANG W, WU L, et al. Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21002-21012.
16	ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000.
17	TONG Z, CHEN Y, XU Z, et al. Wise-IoU: bounding box regression loss with dynamic focusing mechanism[EB/OL]. (2023-04-08) [2024-04-10].
18	REZATOFIGHI H, TSOI N, GWAK J, et al. Generalized intersection over union: a metric and a loss for bounding box regression[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 658-666.
19	ZENG L, SUN B, ZHU D. Underwater target detection based on Faster R-CNN and adversarial occlusion network[J]. Engineering Applications of Artificial Intelligence, 2021, 100: No.104190.
20	CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with Transformers[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12346. Cham: Springer, 2020: 213-229.
21	LEI F, TANG F, LI S. Underwater target detection algorithm based on improved YOLOv5[J]. Journal of Marine Science and Engineering, 2022, 10(3): No.310.
22	LIN W H, ZHONG J X, LIU S, et al. RoIMix: proposal-fusion among multiple images for underwater object detection[C]// Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2020: 2588-2592.
23	STERGIOU A, POPPE R, KALLIATAKIS G. Refining activation downsampling with SoftPool[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10337-10346.
24	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141.
25	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.

Attention mechanism	RUOD mAP_0.5/%	RUOD v/(frame·s^-1)	URPC mAP_0.5/%	URPC v/(frame·s^-1)	S/MB
SE [24]	69.3	51	86.3	42	13.5
CBAM [25]	69.8	53	86.9	43	13.9
LKCA [7]	71.1	45	87.0	38	20.6
CA [12]	70.8	52	86.8	45	14.1
FCA	71.4	54	87.2	46	14.4
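
The FCA entry combines coordinate attention [12] with the funnel activation FReLU [13]. As a hedged illustration of the activation alone, the sketch below applies y = max(x, T(x)) to one channel, where T(x) is a 3x3 depthwise convolution. It is not the authors' FCA module: the kernel is a fixed, hypothetical stand-in for learned weights, the batch normalization that normally follows the convolution is omitted, and the function names are illustrative.

```python
from typing import List

Grid = List[List[float]]


def depthwise_conv3x3(x: Grid, kernel: Grid) -> Grid:
    """3x3 convolution over a single channel, stride 1, zero padding."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    # Positions outside the map contribute zero.
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def frelu(x: Grid, kernel: Grid) -> Grid:
    """Funnel activation: element-wise max of x and its spatial condition T(x)."""
    t = depthwise_conv3x3(x, kernel)
    return [
        [max(a, b) for a, b in zip(row_x, row_t)]
        for row_x, row_t in zip(x, t)
    ]
```

Unlike ReLU, the threshold here depends on each pixel's neighbourhood, so a negative response can survive when its surroundings are also negative. With a uniform kernel of 0.25 and the input `[[-4.0, -4.0], [-4.0, -2.0]]`, every position sees T(x) = -3.5 and the output keeps negative values instead of clamping them to zero.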

Loss function	RUOD mAP_0.5/%	RUOD v/(frame·s^-1)	URPC mAP_0.5/%	URPC v/(frame·s^-1)
CIoU [16]	68.9	58	85.7	52
EIoU [14]	69.2	54	85.8	51
GIoU [18]	70.3	53	85.8	50
WIoU v1	71.6	54	86.1	52
WIoU v2	71.9	53	86.4	53
WIoU v3	72.4	53	86.7	53
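
The comparison above changes only the bounding-box regression loss. As a rough illustration of how CIoU differs from the Wise-IoU variants, the sketch below evaluates each loss for one pair of axis-aligned boxes, following the definitions published in [16] and [17]. Several things are assumptions rather than details from this page: boxes are (x1, y1, x2, y2) with positive width and height, `alpha=1.9` and `delta=3.0` are assumed default hyperparameters, and `mean_iou_loss` stands in for the running mean of the IoU loss that training code would maintain. WIoU v2 is omitted, and this is not the authors' implementation.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _center_term(pred: Box, gt: Box) -> float:
    """Squared centre distance divided by the squared enclosing-box diagonal."""
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    wc = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hc = max(pred[3], gt[3]) - min(pred[1], gt[1])
    return rho2 / (wc ** 2 + hc ** 2)


def ciou_loss(pred: Box, gt: Box) -> float:
    """CIoU loss: IoU loss + centre-distance penalty + aspect-ratio penalty."""
    i = iou(pred, gt)
    v = (4 / math.pi ** 2) * (
        math.atan((gt[2] - gt[0]) / (gt[3] - gt[1]))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))
    ) ** 2
    trade_off = v / ((1 - i) + v) if v > 0 else 0.0
    return 1 - i + _center_term(pred, gt) + trade_off * v


def wiou_v1_loss(pred: Box, gt: Box) -> float:
    """WIoU v1: IoU loss scaled by a distance-based attention factor."""
    return math.exp(_center_term(pred, gt)) * (1 - iou(pred, gt))


def wiou_v3_loss(pred: Box, gt: Box, mean_iou_loss: float,
                 alpha: float = 1.9, delta: float = 3.0) -> float:
    """WIoU v3: v1 loss times a non-monotonic gain of the outlier degree beta."""
    beta = (1 - iou(pred, gt)) / mean_iou_loss  # outlier degree of this box
    gain = beta / (delta * alpha ** (beta - delta))
    return gain * wiou_v1_loss(pred, gt)
```

In this formulation the gain shrinks for boxes whose loss is far above or far below the running mean, so both very easy and very poor examples contribute less to the gradient than ordinary ones. In a real training loop the enclosing-box size and beta are treated as constants during backpropagation, which plain Python floats cannot express.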

Model	RUOD mAP_0.5/%	RUOD v/(frame·s^-1)	URPC mAP_0.5/%	URPC v/(frame·s^-1)	S/MB
Faster R-CNN [19]	58.6	16	64.4	15	42.0
SSD300 [5]	66.4	58	71.3	49	28.0
DETR-DC5 [20]	60.8	42	69.7	31	41.0
YOLOv5s	66.8	54	83.2	46	7.2
YOLOv7	67.6	55	84.6	48	14.4
PANet [21]	61.4	62	80.5	48	14.6
RoIMix [22]	70.2	42	84.2	45	17.4
LKCA-YOLOv5 [7]	72.1	48	87.3	33	20.6
YOLOv8s	68.9	58	85.7	52	12.3
WCA-YOLOv8	75.8	60	88.6	57	15.9
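
One way to read the comparison table is to list, for each dataset, the models that exceed WCA-YOLOv8 on accuracy or on speed. The sketch below does this with the table's values transcribed by hand; the helper name `models_ahead_of` and the tuple layout are hypothetical conveniences, not part of the paper.

```python
from typing import List, Tuple

# (model, RUOD mAP_0.5 %, RUOD frame/s, URPC mAP_0.5 %, URPC frame/s)
ROWS = [
    ("Faster R-CNN", 58.6, 16, 64.4, 15),
    ("SSD300", 66.4, 58, 71.3, 49),
    ("DETR-DC5", 60.8, 42, 69.7, 31),
    ("YOLOv5s", 66.8, 54, 83.2, 46),
    ("YOLOv7", 67.6, 55, 84.6, 48),
    ("PANet", 61.4, 62, 80.5, 48),
    ("RoIMix", 70.2, 42, 84.2, 45),
    ("LKCA-YOLOv5", 72.1, 48, 87.3, 33),
    ("YOLOv8s", 68.9, 58, 85.7, 52),
    ("WCA-YOLOv8", 75.8, 60, 88.6, 57),
]


def models_ahead_of(rows, target_name: str, map_idx: int,
                    fps_idx: int) -> Tuple[List[str], List[str]]:
    """Return (models with higher mAP, models with higher frame rate)."""
    target = next(r for r in rows if r[0] == target_name)
    more_accurate = [r[0] for r in rows if r[map_idx] > target[map_idx]]
    faster = [r[0] for r in rows if r[fps_idx] > target[fps_idx]]
    return more_accurate, faster


ruod = models_ahead_of(ROWS, "WCA-YOLOv8", map_idx=1, fps_idx=2)
urpc = models_ahead_of(ROWS, "WCA-YOLOv8", map_idx=3, fps_idx=4)
```

On these numbers no listed model exceeds WCA-YOLOv8 on mAP_0.5 for either dataset. On speed, only PANet on RUOD is ahead (62 versus 60 frame/s), and no model is faster on URPC.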

Model structure: DEM (D)	FCA (F)	WIoU v3 (L)	FFM (M)	S/MB	T/min	C/GFLOPs	v/(frame·s^-1)	mAP_0.5/%
-	-	-	-	12.3	157	16.7	58	68.9
√	-	-	-	9.5	139	13.2	67	70.1
-	√	-	-	14.4	169	18.3	54	71.4
-	-	√	-	14.2	166	18.7	53	72.4
-	-	-	√	13.2	162	17.9	56	71.6
√	√	-	-	11.3	151	15.2	63	72.5
√	√	√	-	13.8	154	15.7	62	74.7
√	√	√	√	15.9	158	16.4	60	75.8
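
The ablation results are easier to compare as differences from the unmodified baseline in the first row. The short sketch below computes the mAP_0.5 gain (in percentage points) and the change in frame rate for every configuration, using the table's values transcribed by hand. Each configuration is keyed by the column letters D, F, L and M that are ticked in its row; the function name and data layout are illustrative only.

```python
from typing import Dict, Tuple

# (enabled components, S/MB, T/min, C/GFLOPs, frame/s, mAP_0.5 %)
ABLATION = [
    ("", 12.3, 157, 16.7, 58, 68.9),  # baseline, no component enabled
    ("D", 9.5, 139, 13.2, 67, 70.1),
    ("F", 14.4, 169, 18.3, 54, 71.4),
    ("L", 14.2, 166, 18.7, 53, 72.4),
    ("M", 13.2, 162, 17.9, 56, 71.6),
    ("DF", 11.3, 151, 15.2, 63, 72.5),
    ("DFL", 13.8, 154, 15.7, 62, 74.7),
    ("DFLM", 15.9, 158, 16.4, 60, 75.8),
]


def gains_over_baseline(rows) -> Dict[str, Tuple[float, int]]:
    """Map each configuration to (mAP gain in points, frame-rate change)."""
    base = rows[0]
    return {
        (r[0] or "baseline"): (round(r[5] - base[5], 1), r[4] - base[4])
        for r in rows
    }


gains = gains_over_baseline(ABLATION)
```

Read this way, the full configuration gains 6.9 points of mAP_0.5 over the baseline while running 2 frame/s faster. Among the single-component rows, only D raises the frame rate (by 9 frame/s) and L gives the largest accuracy gain (3.5 points).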
