Journal of Computer Applications, 2025, Vol. 45, Issue (12): 4021-4029. DOI: 10.11772/j.issn.1001-9081.2024121811

• Multimedia computing and computer simulation •

Multi-scale small target detection algorithm for UAV perspective based on channel-prior multi-scale cross-axis attention-YOLO

Hailin XIAO1, Bo TIAN1, Bin HU1, Xiangting KONG1, Yuanyuan WU1, Renyu MA2, Zhongshan ZHANG3   

    1. School of Computer Science, Hubei University, Wuhan Hubei 430062, China
    2. School of Artificial Intelligence, Hubei University, Wuhan Hubei 430062, China
    3. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
  • Received:2024-12-25 Revised:2025-04-01 Accepted:2025-04-02 Online:2025-04-15 Published:2025-12-10
  • Contact: Hailin XIAO
  • About author: XIAO Hailin, born in 1976, Ph.D., professor. His research interests include 5G/B5G intelligent information processing.
    TIAN Bo, born in 1999, M.S. candidate. His research interests include target detection and image processing.
    HU Bin, born in 2000, M.S. candidate. His research interests include image segmentation.
    KONG Xiangting, born in 1996, M.S. candidate. Her research interests include digital image watermarking and image encryption.
    WU Yuanyuan, born in 1999, M.S. candidate. Her research interests include target detection.
    MA Renyu, born in 2001, M.S. candidate. His research interests include UAV path planning.
    ZHANG Zhongshan, born in 1974, Ph.D., professor. His research interests include 5G/B5G communication.
  • Supported by:
    National Natural Science Foundation of China (61872406); Outstanding Young and Middle-aged Science and Technology Innovation Team Program in Hubei Universities and Colleges (T2021001); Guangxi Key Science and Technology Special Project (AA24263034); Guangxi Key Research and Development Program (Guike AB25069340)

基于信道先验多尺度跨轴注意YOLO的无人机视角下多尺度小目标检测算法

肖海林1, 田波1, 胡彬1, 孔祥婷1, 吴媛媛1, 马仁煜2, 张中山3   

    1.湖北大学 计算机学院,武汉 430062
    2.湖北大学 人工智能学院,武汉 430062
    3.北京理工大学 信息与电子学院,北京 100081
  • 通讯作者: 肖海林
  • 作者简介:肖海林(1976—),男,湖北黄冈人,教授,博士,主要研究方向:5G/B5G智能信息处理
    田波(1999—),男(土家族),湖北恩施人,硕士研究生,主要研究方向:目标检测、图像处理
    胡彬(2000—),男,湖北鄂州人,硕士研究生,主要研究方向:图像分割
    孔祥婷(1996—),女,山西吕梁人,硕士研究生,主要研究方向:数字图像水印、图像加密
    吴媛媛(1999—),女,河南信阳人,硕士研究生,主要研究方向:目标检测
    马仁煜(2001—),男,安徽淮南人,硕士研究生,主要研究方向:无人机路径规划
    张中山(1974—),男,河北遵化人,教授,博士,主要研究方向:5G/B5G通信。
  • 基金资助:
    国家自然科学基金资助项目(61872406);广西重大专项(桂科AA24263034);广西重点研发计划项目(桂科AB25069340);湖北省高等学校优秀中青年科技创新团队计划项目(T2021001)

Abstract:

To address the low accuracy of small target detection from the Unmanned Aerial Vehicle (UAV) perspective, a multi-scale small target detection algorithm based on Channel-Prior Multi-Scale cross-axis attention-YOLO (CPMS-YOLO) was proposed. Firstly, a multi-scale attention module named CPMS (Channel-Prior Multi-Scale cross-axis attention) was incorporated into the backbone network to better extract and emphasize useful features in complex backgrounds. With this module, the algorithm was able to learn the location details of regions of interest more easily and to extract features of small targets at different scales in complex backgrounds more effectively. Secondly, the backbone network and the feature fusion network were restructured by adding a feature layer rich in small-target semantic information, and the fusion module MultiSEAM (Multi-scale Separated and Enhancement Attention Module) was adopted so that contextual features complement one another, thereby detecting and recognizing small targets better. Thirdly, a C2f-RFE (C2f-Receptive Field Enhancement) module was designed to improve the deep C2f (Faster Implementation of CSP Bottleneck with 2 convolutions) module in the neck network, so as to expand the receptive field of the feature map, locate target features more accurately, more quickly, and from multiple angles, and thus enhance small target detection ability. Finally, the WIoUv3 (Wise-IoU (Intersection over Union) v3) loss function was introduced to dynamically optimize the loss weights of small targets, so as to address the discrepancy between positive and negative samples in the bounding box regression task, thereby further improving the detection ability for small targets.
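The dynamic weighting attributed to WIoUv3 above can be illustrated with a minimal sketch. It assumes the commonly cited Wise-IoU v3 formulation: an IoU loss scaled by a centre-distance penalty and by a non-monotonic focusing gain that depends on how far a box's loss deviates from the average. The function names, the hyperparameters `alpha=1.9` and `delta=3.0`, and the fixed `mean_iou_loss` argument are assumptions for illustration, not details taken from the abstract; training implementations typically track the mean as a running average and exclude it and the enclosing-box term from the gradient.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2) with positive area."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    beta is the outlier degree: a box's IoU loss divided by the mean IoU loss.
    alpha and delta are assumed hyperparameter values.
    """
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3_loss(pred, gt, mean_iou_loss, alpha=1.9, delta=3.0):
    """Sketch of a Wise-IoU v3 style loss for one predicted/ground-truth pair."""
    l_iou = 1.0 - box_iou(pred, gt)

    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # Offset between the two box centres.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2

    # Distance penalty: grows as the centres move apart.
    r_wiou = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))

    # Outlier degree relative to the (assumed, externally supplied) mean loss.
    beta = l_iou / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou
```

Under these assumed hyperparameters the gain equals 1 when `beta == delta` and is smaller for both very low and very high outlier degrees, so boxes of ordinary quality receive the largest weight while very easy and very poor boxes are down-weighted.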
Experimental results on the public dataset VisDrone2019 show that, compared with the baseline algorithm YOLOv8s, the proposed algorithm improves precision, recall, mAP50 (mean Average Precision at IoU threshold of 50%), and mAP50-95 (mean Average Precision at IoU thresholds from 50% to 95%) by 5.9, 5.8, 6.3, and 3.6 percentage points, respectively. These results indicate that the CPMS-YOLO-based multi-scale small target detection algorithm for the UAV perspective captures and recognizes small targets more accurately.
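The difference between the two mAP variants reported above comes down to which IoU thresholds a detection must clear to count as correct. The sketch below is illustrative only: it assumes the COCO-style convention of ten thresholds from 0.50 to 0.95 in steps of 0.05 (the abstract states only the range), and the helper names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2) with positive area."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


# Assumed threshold grid for mAP50-95: 0.50, 0.55, ..., 0.95.
MAP50_95_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def is_true_positive(pred, gt, threshold):
    """A detection matches a ground-truth box if their IoU reaches the threshold."""
    return box_iou(pred, gt) >= threshold


def thresholds_met(pred, gt):
    """Thresholds of the mAP50-95 grid at which this detection still counts."""
    return [t for t in MAP50_95_THRESHOLDS if is_true_positive(pred, gt, t)]


# A 10x10 prediction shifted by one pixel in x and y from its ground truth.
pred, gt = (0, 0, 10, 10), (1, 1, 11, 11)
print(thresholds_met(pred, gt))
```

A slightly misplaced box like this one counts fully toward mAP50 but only at the looser thresholds of mAP50-95, which is why localization errors on small targets tend to cost more under the stricter metric.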

Key words: YOLOv8, small target detection, multi-scale, Unmanned Aerial Vehicle (UAV), WIoU (Wise-IoU (Intersection over Union))

摘要:

针对当前无人机(UAV)视角下小目标检测存在的准确率低问题,提出一种基于信道先验多尺度跨轴注意YOLO (CPMS-YOLO)的UAV视角下多尺度小目标检测算法。首先,在骨干网络中融入能在复杂背景下更好地提取和强化有用特征的多尺度注意力模块CPMS(Channel-Prior-Multi-Scale cross-axis attention),该模块使算法能更容易地学习感兴趣区域的位置细节并提高对不同尺度小目标在复杂背景下的特征提取能力;其次,对骨干网络和特征融合网络进行重构,即增加一个具有丰富小目标语义信息的特征层,并通过融合模块MultiSEAM(Multi-scale Separated and Enhancement Attention Module)将上下文特征信息进行互补,从而更好地捕捉和识别小目标;再次,设计C2f-RFE(C2f-Receptive Field Enhancement)模块改进颈部网络中深层的C2f(Faster Implementation of CSP Bottleneck with 2 convolutions)模块,以增加特征图的感受野,从而更准确、更快速且多角度地定位目标特征,进而提升对小目标的检测能力;最后,引入损失函数WIoUv3 (Wise-IoU(Intersection over Union) v3)动态优化小目标的损失权值,以解决边界框回归任务中正负样本之间的差异问题,从而进一步提高对小目标的检测能力。在公共数据集VisDrone2019上的实验结果表明:与基准算法YOLOv8s相比,所提算法的精确率、召回率、mAP50(mean Average Precision at IoU threshold of 50%)和mAP50-95(mean Average Precision at IoU thresholds from 50% to 95%)分别提升了5.9、5.8、6.3和3.6个百分点。可见,基于CPMS-YOLO的UAV视角下多尺度小目标检测算法能更精确地捕捉和识别小目标。

关键词: YOLOv8, 小目标检测, 多尺度, 无人机, WIoU
