Multi-scale small object detection algorithm for UAV perspective based on channel-prior multi-scale cross-axis attention-YOLO

doi:10.11772/j.issn.1001-9081.2024121811

Abstract

To address the low accuracy of small target detection from the UAV perspective, a multi-scale small object detection algorithm based on channel-prior multi-scale cross-axis attention-YOLO (CPMS-YOLO) was proposed. Firstly, a multi-scale attention module named CPMS (Channel-Prior Multi-Scale Cross-Axis Attention), designed to better extract and emphasize useful features in complex backgrounds, was incorporated into the backbone network. This module made it easier for the algorithm to learn the location details of regions of interest and improved feature extraction for small targets at different scales against complex backgrounds. Secondly, the backbone and feature fusion networks were restructured by adding a feature layer rich in small-target semantic information, and the multi-scale fusion module MultiSEAM (Multi-scale Separated and Enhancement Attention Module) was adopted to make contextual feature information complementary, enabling better capture and recognition of small objects. Thirdly, a C2f-RFE (C2f-Receptive Field Enhancement) module was designed to improve the deep C2f (Faster Implementation of CSP Bottleneck with 2 convolutions) modules in the neck network, expanding the receptive field of the feature map so that target features could be localized more accurately, more quickly, and from multiple angles, thereby enhancing small object detection. Finally, the WIoUv3 (Wise-IoU v3) loss function was introduced to dynamically optimize the loss weights for small objects, which addressed the imbalance between positive and negative samples in the bounding box regression task and further improved detection performance for small targets. Experimental results on the public VisDrone2019 dataset show that, compared with the baseline algorithm YOLOv8s, CPMS-YOLO improves precision, recall, mAP50, and mAP50-95 by 6.7, 5.8, 6.3, and 7.8 percentage points, respectively. These results indicate that the proposed algorithm captures and recognizes small targets more accurately.
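
As a rough illustration of the loss named in the abstract, the sketch below computes plain IoU and a Wise-IoU v3 style weighting for axis-aligned boxes. It is not the authors' implementation: the function names, the `(x1, y1, x2, y2)` box layout, and the defaults `alpha=1.9` and `delta=3.0` are assumptions, and the running mean of the IoU loss, which a training loop would normally maintain, is passed in by hand.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Wise-IoU v3 style loss for one box pair (hypothetical helper).

    mean_iou_loss stands in for the running mean of the IoU loss;
    alpha and delta are assumed hyperparameter values.
    """
    l_iou = 1.0 - iou(pred, target)

    # Squared distance between box centres.
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2

    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Distance-based attention term (the v1 part of the loss).
    r_wiou = math.exp((dx ** 2 + dy ** 2) / (wg ** 2 + hg ** 2))

    # Outlier degree and non-monotonic focusing coefficient (the v3 part).
    beta = l_iou / mean_iou_loss
    r = beta / (delta * alpha ** (beta - delta))
    return r * r_wiou * l_iou


# The same 2-pixel horizontal shift applied to a small and a large box.
print(iou((0, 0, 10, 10), (2, 0, 12, 10)))      # roughly 0.67
print(iou((0, 0, 100, 100), (2, 0, 102, 100)))  # roughly 0.96
print(wiou_v3((0, 0, 10, 10), (2, 0, 12, 10), mean_iou_loss=0.2))
```

The two IoU values hint at why small targets are harder: an identical 2-pixel shift costs the 10×10 box about a third of its IoU but barely affects the 100×100 box. The focusing coefficient `r` equals 1 when the outlier degree `beta` equals `delta` and shrinks for both very easy and very poor predictions, which is how the loss weight adapts per box.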

Key words: YOLOv8, small target detection, multi-scale, unmanned aerial vehicle, Wise-IoU (WIoU)


CLC Number: TP391.4

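Hailin XIAO, Bo TIAN, Bin HU, Xiangting KONG, Yuanyuan WU, Renyu MA, Zhongshan ZHANG. Multi-scale small object detection algorithm for UAV perspective based on channel-prior multi-scale cross-axis attention-YOLO [J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2024121811.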

