Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (7): 2342-2350. DOI: 10.11772/j.issn.1001-9081.2024070946

• Multimedia computing and computer simulation •

Small target detection model for UAV aerial photography based on improved YOLOv8

Bogan FAN, Shuqing WANG, Kaiyuan CHEN

  1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan Hubei 430068, China
  • Received: 2024-07-08 Revised: 2024-10-11 Accepted: 2024-10-11 Online: 2025-07-10 Published: 2025-07-10
  • Contact: Shuqing WANG
  • About the authors: FAN Bogan, born in 2001, M.S. candidate. His research interests include target detection and image processing.
    WANG Shuqing, born in 1968, Ph.D., professor. Her research interests include industrial control, machine vision, and image processing.
    CHEN Kaiyuan, born in 2000, M.S. candidate. His research interests include target detection and image processing.
  • Supported by:
    National Natural Science Foundation of China (62306107)

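Small target detection model for aerial-photography UAVs based on improved YOLOv8

Bogan FAN, Shuqing WANG, Kaiyuan CHEN

  1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068
  • Corresponding author: Shuqing WANG
  • About the authors: FAN Bogan (born 2001), male, from Lu'an, Anhui; M.S. candidate and CCF student member. Research interests: target detection, image processing.
    WANG Shuqing (born 1968), female, from Hengshui, Hebei; professor, Ph.D. Research interests: industrial control, machine vision, image processing. w2246102938@163.com
    CHEN Kaiyuan (born 2000), male, from Shiyan, Hubei; M.S. candidate. Research interests: target detection, image processing.
  • Supported by:
    National Natural Science Foundation of China (62306107)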

Abstract:

To address the low detection performance and the missed and false detections of small targets from the Unmanned Aerial Vehicle (UAV) perspective, an improved model based on YOLOv8, BDS-YOLO (BiFPN-Dual-Small target detection-YOLO), was proposed. Firstly, RepViTBlock (Revisiting mobile CNN from ViT perspective Block) and the EMA (Efficient Multi-scale Attention) mechanism were used to construct C2f-RE (C2f-RepViTBlock Efficient multi-scale attention) to improve the deep C2f (faster implementation of CSP bottleneck with 2 Convolutions) modules in the backbone network, thereby enhancing the model's ability to extract small target features while reducing the number of parameters. Secondly, BiFPN (Bi-directional Feature Pyramid Network) was used to reconstruct the neck network, so that features at different levels could be fused with one another. Thirdly, a dual small target detection layer was constructed on top of the improved neck network; it combines the shallow and shallowest features to improve the model's ability to detect small targets. Finally, the improved loss function Inner-EIoU (Inner-Efficient-Intersection over Union) was introduced, which uses a more reasonable measure of aspect ratio and addresses the limitations of IoU (Intersection over Union) itself. Experimental results on the VisDrone2019 dataset show that, compared with the original model, the improved model raises precision, recall, mAP@50, and mAP@50:95 by 8.5, 7.7, 9.2, and 6.3 percentage points, respectively, while its parameter count is only 2.23×10⁶ and the model size is reduced by 19.1%. The proposed model therefore improves performance significantly while also becoming more lightweight.

Key words: YOLOv8, C2f (faster implementation of CSP bottleneck with 2 Convolutions), Unmanned Aerial Vehicle (UAV), small target detection, Inner-EIoU (Inner-Efficient-Intersection over Union)
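
The abstract states that BiFPN is used so that features at different levels can be fused with one another. As an illustration only, the following minimal sketch shows the fast normalized fusion rule from the original BiFPN formulation: each input feature gets a learnable non-negative weight, and the weighted sum is divided by the sum of the weights. Whether BDS-YOLO uses exactly this rule is an assumption, since the abstract does not say; the function name `fast_normalized_fusion`, the flat-list feature representation, and the `eps` value are hypothetical choices made for this sketch.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse same-shaped feature vectors with non-negative normalized weights.

    Assumption: the inputs were already resized to a common shape and are
    given here as flat lists of floats (a stand-in for real feature maps).
    """
    if len(features) != len(weights):
        raise ValueError("one weight is required per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all features must have the same length")

    # ReLU keeps every weight non-negative, so the normalization is stable.
    clipped = [max(0.0, w) for w in weights]
    norm = sum(clipped) + eps

    # Weighted sum over inputs at each position, divided by the weight total.
    return [
        sum(w * f[k] for w, f in zip(clipped, features)) / norm
        for k in range(length)
    ]
```

With equal weights the result is close to the plain average of the inputs; a negative weight is clipped to zero, so that input simply drops out of the fusion.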

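Abstract (from the Chinese-language version):

To address the low detection performance and the missed and false detections of small targets from the Unmanned Aerial Vehicle (UAV) perspective, the BDS-YOLO (BiFPN-Dual-Small target detection-YOLO) model, an improvement on YOLOv8, is proposed. First, RepViTBlock (Revisiting mobile CNN from ViT perspective Block) and the EMA (Efficient Multi-scale Attention) mechanism are used to construct C2f-RE (C2f-RepViTBlock Efficient multi-scale attention), which improves the deep C2f (faster implementation of CSP bottleneck with 2 Convolutions) modules in the backbone network, strengthening the model's extraction of small target features and lowering the parameter count. Second, a Bi-directional Feature Pyramid Network (BiFPN) is used to reconstruct the neck network so that features at different levels can be fused with one another. Next, a dual small target detection layer is built on the improved neck network and combined with the shallow and shallowest features to improve the model's detection of small targets. Finally, the improved loss function Inner-EIoU (Inner-Efficient-Intersection over Union) is introduced; it uses a more reasonable measure of aspect ratio and resolves the limitations of Intersection over Union (IoU) itself. Experimental results show that, on the VisDrone2019 dataset, the improved model raises precision, recall, mAP@50, and mAP@50:95 over the original model by 8.5, 7.7, 9.2, and 6.3 percentage points, respectively, while the parameter count is only 2.23×10⁶ and the model size is reduced by 19.1%. The proposed model thus improves performance significantly while achieving a degree of lightweighting.

Key words: YOLOv8, C2f, UAV, small target detection, Inner-EIoU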

CLC Number: