Journal of Computer Applications, pp. 280-285. DOI: 10.11772/j.issn.1001-9081.2024050621

• Multimedia computing and computer simulation •

Improvement strategy of YOLO algorithm for small target detection from high-altitude view

Jiayu CAO, Guifang QIAO, Mengyuan CHEN, Xu ZOU, Di LIU

  1. School of Automation, Nanjing Institute of Technology, Nanjing, Jiangsu 211167, China
  • Received: 2024-05-15 Revised: 2024-07-09 Accepted: 2024-07-15 Online: 2025-01-24 Published: 2024-12-31
  • Contact: Guifang QIAO

面向高空视角小目标检测的YOLO算法改进策略

曹嘉禹, 乔贵方, 陈梦源, 邹旭, 刘娣

  1. 南京工程学院 自动化学院,南京 211167
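  • Corresponding author: Guifang QIAO (乔贵方)
  • About the authors: Jiayu CAO (曹嘉禹), born 1999, male, from Suzhou, Jiangsu, M.S. candidate. Research interests: object detection, 3D mapping, and path planning.
    Guifang QIAO (乔贵方), born 1987, male, from Xuzhou, Jiangsu, associate professor, Ph.D. Research interests: industrial robot testing and calibration, bionic robot control.
    Mengyuan CHEN (陈梦源), born 2000, female, from Xuzhou, Jiangsu, M.S. candidate. Research interests: mobile robot mapping and navigation.
    Xu ZOU (邹旭), born 1999, male, from Chongqing, M.S. candidate. Research interests: power inspection robots.
    Di LIU (刘娣), born 1983, female, from Huai'an, Jiangsu, professor, Ph.D. Research interests: satellite navigation and integrated navigation, intelligent control, robot modeling and control.
  • Supported by:
    Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJCX23_1159)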

Abstract:

To address the complex backgrounds, insufficient feature extraction, small target sizes, detection difficulty, and frequent missed detections encountered in target detection from the high-altitude view of Unmanned Aerial Vehicles (UAVs), an improved target detection algorithm based on YOLOv8n was proposed. Firstly, the network structure was optimized: a P2 small target detection layer was added and the P5 large target detection layer was deleted to improve small target perception. Secondly, Receptive Field Attention Convolution (RFAConv) was introduced to improve the Bottleneck of C2f, enhancing feature extraction and fusion in the spatial dimension. Thirdly, the Dynamic head (Dyhead) module was introduced into the Detect detection head to enhance the expression and generalization capabilities of the model. Finally, Normalized Wasserstein Distance (NWD) was used to measure bounding box similarity, reducing sensitivity to scale. On the VisDrone2019 dataset, the improved YOLOv8n, YOLOv9t and YOLOv10n increase the Average Precision (AP) by 15.6%, 16.7% and 31.0%, respectively, compared with their unimproved counterparts. Detection results on the SAR Ship Detection Dataset (SSDD) show that the improved algorithm generalizes well and is robust. The improved algorithm thus enhances small target feature extraction and fusion and achieves better small target detection.
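The NWD step can be illustrated with a minimal sketch. It follows the commonly published form of NWD, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian and the similarity is exp(-W2 / C). It is not taken from the authors' implementation: the function names `nwd` and `iou` and the value of the constant `c` are assumptions chosen for illustration, and `c` is normally tuned to the dataset.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two boxes given as (cx, cy, w, h).

    Each box is treated as a 2D Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). `c` is a dataset-dependent constant; the default
    here is only a placeholder assumption.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def iou(box_a, box_b):
    """Plain IoU for boxes given as (cx, cy, w, h), for comparison only."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    inter_w = max(0.0, min(cxa + wa / 2, cxb + wb / 2) - max(cxa - wa / 2, cxb - wb / 2))
    inter_h = max(0.0, min(cya + ha / 2, cyb + hb / 2) - max(cya - ha / 2, cyb - hb / 2))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    return inter / union


# Two 6x6 boxes whose centers are 10 px apart do not overlap at all.
a = (10.0, 10.0, 6.0, 6.0)
b = (20.0, 10.0, 6.0, 6.0)
print(iou(a, b), nwd(a, b))
```

For such tiny, non-overlapping boxes IoU is zero and carries no information about how close the boxes are, whereas NWD still decreases smoothly with distance. This is the sense in which a Wasserstein-based measure is less sensitive to scale.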

Key words: YOLOv8n, small object detection, Receptive Field Attention Convolution (RFAConv), Dynamic head (Dyhead), Normalized Wasserstein Distance (NWD)

摘要:

针对无人机高空视角下背景复杂、特征提取能力不足、目标尺寸小、难以检测、漏检严重等问题,在YOLOv8n的基础上提出改进的无人机高空视角目标检测算法。首先优化网络结构,通过增加P2小目标检测层并删去P5大目标检测层提升小目标感知能力;其次引入感受野注意力卷积(RFAConv)以改进C2f的颈部结构,并从空间维度提高网络的特征提取及特征融合能力;再次将动态头(Dyhead)模块引入Detect检测头,从而增强模型的表达能力和泛化能力;最后使用归一化Wasserstein距离(NWD)度量边界框相似性,从而降低尺度敏感性。在Visdrone2019数据集上,改进后的YOLOv8n、YOLOv9t和YOLOv10n与改进前的相比,在平均精度(AP)上分别提升了15.6%、16.7%和31.0%;在SAR舰船检测数据集(SSDD)上的检测结果表明改进算法泛化能力较好,具有较强的鲁棒性。可见改进后的算法提升了小目标特征提取及融合能力,并具有更好的小目标检测效果。

关键词: YOLOv8n, 小目标检测, 感受野注意力卷积, 动态头, 归一化Wasserstein距离

CLC Number: