Small target detection model for UAV aerial photography based on improved YOLOv8

doi:10.11772/j.issn.1001-9081.2024070946

Abstract

To address the low detection performance and the frequent missed and false detections of small targets in Unmanned Aerial Vehicle (UAV) imagery, an improved model based on YOLOv8, BDS-YOLO (BiFPN-Dual-Small target detection-YOLO), was proposed. Firstly, RepViTBlock (Revisiting mobile CNN from ViT perspective Block) and the EMA (Efficient Multi-scale Attention) mechanism were used to construct C2f-RE (C2f-RepViTBlock Efficient multi-scale attention) to improve the deep C2f (faster implementation of CSP bottleneck with 2 Convolutions) modules in the backbone network, thereby enhancing the model's ability to extract small-target features while reducing the number of parameters. Secondly, BiFPN (Bi-directional Feature Pyramid Network) was used to reconstruct the neck network so that features at different levels could be fused with one another. Thirdly, a dual small-target detection layer was built on the improved neck network, combining shallow and shallowest features to improve the model's ability to detect small targets. Finally, the improved loss function Inner-EIoU (Inner-Efficient-Intersection over Union) was introduced, which uses a more reasonable measure of aspect ratio and addresses limitations of IoU (Intersection over Union) itself. Experimental results on the VisDrone2019 dataset show that, compared with the original model, the improved model raises precision, recall, mAP@50, and mAP@50:95 by 8.5, 7.7, 9.2, and 6.3 percentage points, respectively, with only 2.23×10⁶ parameters and a model size reduced by 19.1%. The proposed model therefore improves performance significantly while also becoming somewhat more lightweight.

Key words: YOLOv8, C2f (faster implementation of CSP bottleneck with 2 Convolutions), Unmanned Aerial Vehicle (UAV), small target detection, Inner-EIoU (Inner-Efficient-Intersection over Union)
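The Inner-EIoU loss named in the abstract can be illustrated with a small sketch. It assumes the commonly published formulations: EIoU adds a center-distance penalty and separate width and height penalties to 1 − IoU, and Inner-IoU replaces the IoU term with the IoU of auxiliary boxes scaled about each box center by a factor `ratio`. The function names, the `(x1, y1, x2, y2)` box format, and the `eps` constant are illustrative assumptions, and the exact variant used in the paper may differ.

```python
def _iou(a, b):
    """Plain IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _scale(box, ratio):
    """Auxiliary box: same center, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_eiou_loss(pred, gt, ratio=1.0, eps=1e-9):
    """Sketch of Inner-EIoU: 1 - IoU_inner + center, width and height penalties."""
    # IoU term computed on the scaled auxiliary boxes (Inner-IoU idea).
    inner_iou = _iou(_scale(pred, ratio), _scale(gt, ratio))

    # Smallest box enclosing the ORIGINAL prediction and ground truth.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # Squared center distance, normalized by the enclosing-box diagonal.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # EIoU penalizes width and height differences separately,
    # instead of a single aspect-ratio term as in CIoU.
    dw = (pred[2] - pred[0]) - (gt[2] - gt[0])
    dh = (pred[3] - pred[1]) - (gt[3] - gt[1])
    w_term = dw * dw / (cw * cw + eps)
    h_term = dh * dh / (ch * ch + eps)

    return 1.0 - inner_iou + center_term + w_term + h_term


if __name__ == "__main__":
    pred, gt = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    for r in (0.8, 1.0, 1.2):
        print(f"ratio={r}: loss={inner_eiou_loss(pred, gt, r):.4f}")
```

With `ratio=1.0` the sketch reduces to ordinary EIoU. A ratio above 1 enlarges the auxiliary boxes, which raises the IoU term for weakly overlapping pairs such as small targets.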

CLC Number:

TP391.4

Bogan FAN, Shuqing WANG, Kaiyuan CHEN. Small target detection model for UAV aerial photography based on improved YOLOv8[J]. Journal of Computer Applications, 2025, 45(7): 2342-2350.

Parameter	Setting	Parameter	Setting
epochs	200	lrf	0.01
imgsz	640	patience	20
batch	8	lr0	0.01
workers	8	momentum	0.937
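The parameter names in this table correspond to Ultralytics training arguments (`epochs`, `imgsz`, `batch`, `workers`, `lr0`, `lrf`, `momentum`, `patience`), so the settings can be written as a configuration sketch. This assumes the experiments were run with the Ultralytics framework. The `yolov8n.yaml` and `VisDrone.yaml` file names are placeholders chosen for illustration and are not stated in the source.

```python
# Hyperparameters copied from the table above.
TRAIN_ARGS = {
    "epochs": 200,      # maximum number of training epochs
    "imgsz": 640,       # input image size in pixels
    "batch": 8,         # batch size
    "workers": 8,       # data-loader worker processes
    "lr0": 0.01,        # initial learning rate
    "lrf": 0.01,        # final learning rate as a fraction of lr0
    "momentum": 0.937,  # optimizer momentum
    "patience": 20,     # early-stopping patience, in epochs without improvement
}


def train(model_cfg="yolov8n.yaml", data_cfg="VisDrone.yaml"):
    """Launch training with the tabulated settings (both paths are assumptions)."""
    # Imported lazily so the settings above can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(model_cfg)
    return model.train(data=data_cfg, **TRAIN_ARGS)


if __name__ == "__main__":
    for name, value in TRAIN_ARGS.items():
        print(f"{name} = {value}")
```

Because `lrf` is a fraction of `lr0`, these settings imply a final learning rate of 0.01 × 0.01 = 1×10⁻⁴ under that convention.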

Position	Precision	Recall	mAP@50	mAP@50:95
P5	0.531	0.416	0.424	0.248
P4 P5	0.516	0.413	0.423	0.246
P3 P4 P5	0.511	0.411	0.418	0.244
P2 P3 P4 P5	0.512	0.407	0.414	0.241
P1 P2 P3 P4 P5	0.514	0.403	0.411	0.238

ratio	Precision	Recall	mAP@50	mAP@50:95
0.8	0.461	0.337	0.332	0.188
0.9	0.458	0.339	0.333	0.190
1.0	0.439	0.342	0.333	0.189
1.1	0.465	0.343	0.338	0.192
1.2	0.460	0.332	0.332	0.190
1.3	0.444	0.343	0.333	0.190
1.4	0.464	0.336	0.332	0.189
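The table above can be read programmatically to see which `ratio` value scores best on each metric. The sketch below only re-tabulates the numbers already listed. It assumes, without confirmation from this excerpt, that `ratio` is the scale factor of the Inner-IoU auxiliary boxes. The helper name `best_ratios` is illustrative.

```python
# (ratio, precision, recall, mAP@50, mAP@50:95), copied from the table above.
ROWS = [
    (0.8, 0.461, 0.337, 0.332, 0.188),
    (0.9, 0.458, 0.339, 0.333, 0.190),
    (1.0, 0.439, 0.342, 0.333, 0.189),
    (1.1, 0.465, 0.343, 0.338, 0.192),
    (1.2, 0.460, 0.332, 0.332, 0.190),
    (1.3, 0.444, 0.343, 0.333, 0.190),
    (1.4, 0.464, 0.336, 0.332, 0.189),
]
METRICS = ("precision", "recall", "mAP@50", "mAP@50:95")


def best_ratios(rows, col):
    """Return every ratio whose value in column `col` equals the column maximum."""
    top = max(r[col] for r in rows)
    return [r[0] for r in rows if r[col] == top]


# Column 0 holds the ratio, so metric i lives in column i + 1.
winners = {name: best_ratios(ROWS, i + 1) for i, name in enumerate(METRICS)}

if __name__ == "__main__":
    for name, ratios in winners.items():
        print(f"{name}: best ratio(s) {ratios}")
```

On these numbers, a ratio of 1.1 gives the highest precision, mAP@50, and mAP@50:95, and ties with 1.3 for the highest recall.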