Journal of Computer Applications, 2023, Vol. 43, Issue (11): 3579-3586. DOI: 10.11772/j.issn.1001-9081.2022111660
Special Issue: Multimedia computing and computer simulation
Qiangqiang QIN, Junguo LIAO, Yixun ZHOU
Received: 2022-11-09
Revised: 2023-03-03
Accepted: 2023-03-03
Online: 2023-03-20
Published: 2023-11-10
Contact: Junguo LIAO
About author: QIN Qiangqiang, born in 1990 (1997 according to the Chinese-language biography), male, from Wuhu, Anhui, M. S. candidate, CCF member. His research interests include artificial intelligence and object detection.
Qiangqiang QIN, Junguo LIAO, Yixun ZHOU. Small object detection algorithm based on split mixed attention[J]. Journal of Computer Applications, 2023, 43(11): 3579-3586.
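Chinese-language citation: 秦强强, 廖俊国, 周弋荀. 基于多分支混合注意力的小目标检测算法 (Small object detection algorithm based on multi-branch mixed attention)[J]. 计算机应用 (Journal of Computer Applications), 2023, 43(11): 3579-3586.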
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022111660
Model | Input resolution | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS at 1 280×1 280/(frame·s⁻¹) |
---|---|---|---|---|---|---|
YOLOv5s | 640×640 | 7.02 | 56.81 | 15.8 | 32.27 | 122 |
YOLOv5s | 960×960 | 7.02 | 57.02 | 33.9 | 41.34 | 122 |
YOLOv5s | 1 280×1 280 | 7.02 | 57.31 | 57.3 | 47.92 | 122 |
SMAM-YOLO | 640×640 | 7.37 | 60.59 | 19.9 | 38.16 | 74 |
SMAM-YOLO | 960×960 | 7.37 | 61.44 | 42.6 | 45.82 | 74 |
SMAM-YOLO | 1 280×1 280 | 7.37 | 62.62 | 74.4 | 52.07 | 74 |
Tab. 1 Experimental results at different input resolutions
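The trade-off in Tab. 1 can be read off with a little arithmetic. The following is a minimal sketch, not part of the original article: it copies the GFLOPs and mAP50 values from the table and reports, per input resolution, the mAP50 gain of SMAM-YOLO over YOLOv5s and the relative compute cost. The dictionary layout and the names `TAB1` and `compare_to_baseline` are assumptions made for illustration.

```python
# Values copied from Tab. 1; the layout and all names here are illustrative.
# resolution (pixels per side) -> model -> (GFLOPs, mAP50 in %)
TAB1 = {
    640: {"YOLOv5s": (15.8, 32.27), "SMAM-YOLO": (19.9, 38.16)},
    960: {"YOLOv5s": (33.9, 41.34), "SMAM-YOLO": (42.6, 45.82)},
    1280: {"YOLOv5s": (57.3, 47.92), "SMAM-YOLO": (74.4, 52.07)},
}


def compare_to_baseline(table, baseline="YOLOv5s", candidate="SMAM-YOLO"):
    """Return (resolution, mAP50 gain in points, GFLOPs ratio) per resolution."""
    rows = []
    for resolution, models in sorted(table.items()):
        base_gflops, base_map = models[baseline]
        cand_gflops, cand_map = models[candidate]
        gain = round(cand_map - base_map, 2)        # percentage points of mAP50
        ratio = round(cand_gflops / base_gflops, 2)  # relative compute cost
        rows.append((resolution, gain, ratio))
    return rows


if __name__ == "__main__":
    for resolution, gain, ratio in compare_to_baseline(TAB1):
        print(f"{resolution}: +{gain:.2f} mAP50 points at {ratio:.2f}x GFLOPs")
```

Under these numbers the gain shrinks as resolution grows while the compute ratio stays roughly constant, which is one way to summarise the table.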
No. | Baseline | P2 | SMAM | CSMAM | Layers | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS/(frame·s⁻¹) |
---|---|---|---|---|---|---|---|---|---|---|
a | √ |  |  |  | 270 | 7.02 | 57.31 | 57.27 | 47.92 | 122 |
b | √ | √ |  |  | 328 | 7.17 | 60.63 | 65.39 | 49.70 | 91 |
c | √ | √ | √ |  | 496 | 7.62 | 64.43 | 76.16 | 51.71 | 78 |
d | √ | √ | √ | √ | 587 | 7.37 | 62.62 | 74.40 | 52.07 | 74 |
Tab. 2 Ablation experiment results
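To see what each ablation step contributes, the rows of Tab. 2 can be differenced pairwise. This is a minimal sketch rather than anything from the article: it assumes the check marks fill the component columns from left to right (row b adds P2, row c adds SMAM, row d adds CSMAM), and the names `ABLATION` and `step_gains` are hypothetical.

```python
# Values copied from Tab. 2: (row id, added component, mAP50 in %, FPS).
# The "added component" labels assume the check marks fill columns left to right.
ABLATION = [
    ("a", "baseline", 47.92, 122),
    ("b", "P2", 49.70, 91),
    ("c", "SMAM", 51.71, 78),
    ("d", "CSMAM", 52.07, 74),
]


def step_gains(rows):
    """Return (added component, mAP50 change in points, FPS change) per step."""
    gains = []
    for prev, cur in zip(rows, rows[1:]):
        gains.append((cur[1], round(cur[2] - prev[2], 2), cur[3] - prev[3]))
    return gains


if __name__ == "__main__":
    for component, d_map, d_fps in step_gains(ABLATION):
        print(f"+{component}: {d_map:+.2f} mAP50 points, {d_fps:+d} FPS")
```

With the table's values, the three step gains sum to the overall 4.15-point mAP50 difference between rows a and d.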
Model | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS/(frame·s⁻¹) |
---|---|---|---|---|---|
CBAM | 7.23 | 60.41 | 64.56 | 50.61 | 77.02 |
YOLOX-S | 9.01 | 212.23 | 92.99 | 47.61 | 69.98 |
PP-YOLO-S | 7.91 | 59.16 | 63.37 | 48.23 | 117.08 |
DETR | 41.00 | 123.65 | 86.01 | 46.16 | 27.90 |
YOLOv7-tiny | 6.02 | 48.58 | 47.38 | 45.23 | 131.21 |
YOLOv5s | 7.02 | 60.28 | 60.28 | 50.02 | 63.29 |
SMAM-YOLO | 7.37 | 62.62 | 74.42 | 52.07 | 74.07 |
Tab. 3 Comparison of experimental results for different small object detection models