Journal of Computer Applications, 2024, Vol. 44, Issue (11): 3603-3609. DOI: 10.11772/j.issn.1001-9081.2023111644
• Multimedia computing and computer simulation •
Tao LIU, Shihong JU, Yimeng GAO
Received: 2023-12-01
Revised: 2024-04-05
Accepted: 2024-04-12
Online: 2024-05-30
Published: 2024-11-10
Contact: Shihong JU
About author: LIU Tao, born in 1981 in Dingzhou, Hebei, M. S., associate professor. His research interests include computer vision and intelligent data processing.
Tao LIU, Shihong JU, Yimeng GAO. Small object detection algorithm from drone perspective based on improved YOLOv8n[J]. Journal of Computer Applications, 2024, 44(11): 3603-3609.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023111644
Model | Parameters/ | GFLOPs | mAP50/% | mAP50-95/% |
---|---|---|---|---|
YOLOv8n | 3.01 | | 28.96 | 16.55 |
+FE-C2f | 2.66 | | 28.29 | 15.91 |
+MCA | 3.05 | | 29.26 | 16.68 |
+MPDIoU | 3.01 | | 29.37 | 16.79 |
+SPDConv | 3.27 | 11.7 | 31.15 | 17.95 |
+std | 2.93 | 12.4 | 31.83 | 18.64 |
SFM-YOLOv8 | 2.83 | 14.9 | 33.33 | 19.62 |
Tab. 1 Ablation experiment results
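The ablation rows are easier to compare as changes against the YOLOv8n baseline. The following is a minimal sketch, not the authors' evaluation code: it hard-codes the values from Tab. 1, and the function name `deltas_vs_baseline` and the dictionary keys are hypothetical. GFLOPs is left out because the table does not give it for every row, and the unit of the parameter column is truncated in the header, so only the relative change in parameters is computed.

```python
# Rows transcribed from Tab. 1: (model, params, mAP50 %, mAP50-95 %).
# The parameter unit is truncated in the table header, so it is not assumed here.
ABLATION = [
    ("YOLOv8n", 3.01, 28.96, 16.55),
    ("+FE-C2f", 2.66, 28.29, 15.91),
    ("+MCA", 3.05, 29.26, 16.68),
    ("+MPDIoU", 3.01, 29.37, 16.79),
    ("+SPDConv", 3.27, 31.15, 17.95),
    ("+std", 2.93, 31.83, 18.64),
    ("SFM-YOLOv8", 2.83, 33.33, 19.62),
]


def deltas_vs_baseline(rows):
    """Return each row's change relative to the first row (the baseline)."""
    _, base_params, base_map50, base_map5095 = rows[0]
    out = {}
    for name, params, map50, map5095 in rows[1:]:
        out[name] = {
            # Relative change in parameter count, in percent.
            "params_rel_pct": round(100.0 * (params - base_params) / base_params, 2),
            # Absolute change in percentage points.
            "map50_pts": round(map50 - base_map50, 2),
            "map50_95_pts": round(map5095 - base_map5095, 2),
        }
    return out


if __name__ == "__main__":
    for name, delta in deltas_vs_baseline(ABLATION).items():
        print(f"{name:12s} {delta}")
```

Read this way, the SFM-YOLOv8 row corresponds to +4.37 points mAP50 and +3.07 points mAP50-95 over YOLOv8n, with about 5.98% fewer parameters.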
Model | mAP50/% | mAP50-95/% | FPS |
---|---|---|---|
Faster R-CNN | 22.3 | 16.3 | 15 |
RetinaNet | 24.1 | 16.9 | / |
Cascade R-CNN | 25.6 | 16.1 | / |
YOLOv5n | 27.9 | 15.9 | 61 |
YOLOv5s | 29.7 | 16.2 | 52 |
YOLOv6n | 25.8 | 14.8 | 54 |
YOLOv8n | 29.0 | 16.6 | 60 |
BD-YOLO | 32.6 | 17.9 | / |
Improved YOLOv5s | 32.9 | 18.4 | 50 |
SFM-YOLOv8 | 33.3 | 19.6 | 43 |
Tab. 2 Comparative experiment results
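Tab. 2 reports both accuracy and speed, so one way to read it is to ask which models are not beaten on mAP50 and FPS at the same time. The sketch below is an illustration only: it hard-codes the table values, skips the models whose FPS is given as "/", and assumes the FPS figures are comparable across models, which the table itself does not confirm. The function name `pareto_front` is hypothetical.

```python
# Rows transcribed from Tab. 2: (model, mAP50 %, mAP50-95 %, FPS).
# None marks FPS values the table reports as "/".
COMPARISON = [
    ("Faster R-CNN", 22.3, 16.3, 15),
    ("RetinaNet", 24.1, 16.9, None),
    ("Cascade R-CNN", 25.6, 16.1, None),
    ("YOLOv5n", 27.9, 15.9, 61),
    ("YOLOv5s", 29.7, 16.2, 52),
    ("YOLOv6n", 25.8, 14.8, 54),
    ("YOLOv8n", 29.0, 16.6, 60),
    ("BD-YOLO", 32.6, 17.9, None),
    ("Improved YOLOv5s", 32.9, 18.4, 50),
    ("SFM-YOLOv8", 33.3, 19.6, 43),
]


def pareto_front(rows):
    """Names of models with a reported FPS that no other such model dominates.

    A model is dominated when another model is at least as good on both
    mAP50 and FPS and strictly better on at least one of them.
    """
    timed = [r for r in rows if r[3] is not None]
    front = []
    for name, map50, _, fps in timed:
        dominated = any(
            o_map50 >= map50 and o_fps >= fps and (o_map50 > map50 or o_fps > fps)
            for _, o_map50, _, o_fps in timed
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    print(pareto_front(COMPARISON))
```

Under these assumptions SFM-YOLOv8 stays on the front as the most accurate entry, while faster models such as YOLOv8n and YOLOv5n remain on it because of their higher FPS.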
1 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems — Volume 1. Cambridge: MIT Press, 2015: 91-99. |
2 | HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. |
3 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37. |
4 | LI X, WANG W, WU L, et al. Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection[C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2020: 21002-21012. |
5 | ZHENG Z, WANG P, REN D, et al. Enhancing geometric factors in model learning and inference for object detection and instance segmentation[J]. IEEE Transactions on Cybernetics, 2022, 52(8): 8574-8586. |
6 | CHEN J, KAO S H, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks[C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 12021-12031. |
7 | OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]// Proceedings of the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2023: 1-5. |
8 | SILIANG M, YONG X. MPDIoU: a loss for efficient and accurate bounding box regression[EB/OL]. [2023-10-10]. |
9 | SUNKARA R, LUO T. No more strided convolutions or pooling: a new CNN building block for low-resolution images and small objects[C]// Proceedings of the 2022 Joint European Conference on Machine Learning and Knowledge Discovery in Databases, LNCS 13715. Cham: Springer, 2023: 443-459. |
10 | HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13713-13722. |
11 | HOWARD A G, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. [2023-08-15]. |
12 | ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. |
13 | HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1580-1589. |
14 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. |
15 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |
16 | LI X, HU X, YANG J. Spatial group-wise enhance: improving semantic feature learning in convolutional networks[EB/OL]. [2023-09-22]. |
17 | LIU H, LIU F, FAN X, et al. Polarized self-attention: towards high-quality pixel-wise regression[EB/OL]. [2023-10-12]. |
18 | MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3138-3147. |
19 | ZHANG Q L, YANG Y B. SA-Net: shuffle attention for deep convolutional neural networks[C]// Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2021: 2235-2239. |
20 | DU D, ZHU P, WEN L, et al. VisDrone-DET2019: the vision meets drone object detection in image challenge results[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE, 2019: 213-226. |
21 | LI A D, WU R M, LI X D. Research on improving YOLOv7's small target detection algorithm[J]. Computer Engineering and Applications, 2024, 60(1): 122-134. (in Chinese) |
22 | QIN Q Q, LIAO J G, ZHOU Y X. Small object detection algorithm based on split mixed attention[J]. Journal of Computer Applications, 2023, 43(11): 3579-3586. (in Chinese) |
23 | LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. |
24 | WU M J, YUN L J, CHEN Z Q, et al. Improved YOLOv5s small target detection algorithm in UAV view[J]. Computer Engineering and Applications, 2024, 60(2): 191-199. (in Chinese) |
25 | LIU T, GAO Y M, CHAI R, et al. Improving YOLOv5s UAV view small object detection algorithm[J]. Computer Engineering and Applications, 2024, 60(1): 110-121. (in Chinese) |
26 | LIANG X M, JIA Z H, YU H F, et al. UAV image object detection algorithm based on improved YOLOv7[J]. Radio Engineering, 2024, 54(4): 937-946. (in Chinese) |