Journal of Computer Applications, 2023, Vol. 43, Issue (11): 3579-3586. DOI: 10.11772/j.issn.1001-9081.2022111660
• Multimedia computing and computer simulation •
Qiangqiang QIN, Junguo LIAO, Yixun ZHOU
Received: 2022-11-09
Revised: 2023-03-03
Accepted: 2023-03-03
Online: 2023-03-20
Published: 2023-11-10
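Contact: Junguo LIAO
About author: QIN Qiangqiang, born in 1990, M. S. candidate. His research interests include artificial intelligence, object detection.
Corresponding author (Chinese record): LIAO Junguo (廖俊国)
Author biography (Chinese record): QIN Qiangqiang (秦强强, born 1997), male, from Wuhu, Anhui, M. S. candidate, CCF member. His main research interests are artificial intelligence and object detection.
CLC Number: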
Qiangqiang QIN, Junguo LIAO, Yixun ZHOU. Small object detection algorithm based on split mixed attention[J]. Journal of Computer Applications, 2023, 43(11): 3579-3586.
秦强强, 廖俊国, 周弋荀. 基于多分支混合注意力的小目标检测算法[J]. 计算机应用, 2023, 43(11): 3579-3586.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022111660
Model | Input resolution | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS₁₂₈₀/(frame·s⁻¹) |
---|---|---|---|---|---|---|
YOLOv5s | 640×640 | 7.02 | 56.81 | 15.8 | 32.27 | 122 |
YOLOv5s | 960×960 | 7.02 | 57.02 | 33.9 | 41.34 | 122 |
YOLOv5s | 1 280×1 280 | 7.02 | 57.31 | 57.3 | 47.92 | 122 |
SMAM-YOLO | 640×640 | 7.37 | 60.59 | 19.9 | 38.16 | 74 |
SMAM-YOLO | 960×960 | 7.37 | 61.44 | 42.6 | 45.82 | 74 |
SMAM-YOLO | 1 280×1 280 | 7.37 | 62.62 | 74.4 | 52.07 | 74 |
Tab. 1 Experimental results of resolution
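The mAP50 columns of Tab. 1 can be compared directly. The sketch below is illustrative only: the dictionary layout and the helper name `map50_gain` are assumptions made for this example, and the numbers are copied from the table rather than recomputed from any model.

```python
# Values copied from Tab. 1: input side length -> mAP50 (%) for each model.
map50 = {
    "YOLOv5s": {640: 32.27, 960: 41.34, 1280: 47.92},
    "SMAM-YOLO": {640: 38.16, 960: 45.82, 1280: 52.07},
}


def map50_gain(resolution: int) -> float:
    """Percentage-point mAP50 gain of SMAM-YOLO over YOLOv5s (hypothetical helper)."""
    return round(map50["SMAM-YOLO"][resolution] - map50["YOLOv5s"][resolution], 2)


# Gain at each input resolution listed in the table.
gains = {res: map50_gain(res) for res in (640, 960, 1280)}

for res, gain in gains.items():
    print(f"{res}x{res}: +{gain:.2f} percentage points mAP50")
```

Under these numbers the gap is largest at the smallest input size and narrows as the resolution grows.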
No. | Baseline | P2 | SMAM | CSMAM | Layers | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS/(frame·s⁻¹) |
---|---|---|---|---|---|---|---|---|---|---|
a | √ | | | | 270 | 7.02 | 57.31 | 57.27 | 47.92 | 122 |
b | √ | √ | | | 328 | 7.17 | 60.63 | 65.39 | 49.70 | 91 |
c | √ | √ | √ | | 496 | 7.62 | 64.43 | 76.16 | 51.71 | 78 |
d | √ | √ | √ | √ | 587 | 7.37 | 62.62 | 74.40 | 52.07 | 74 |
Tab. 2 Ablation experimental results
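Each ablation row in Tab. 2 adds one component to the previous row, so the per-step change in accuracy and speed follows by subtraction. The sketch below is a minimal illustration: the tuple layout, the "+P2"-style labels, and the helper name `stepwise_deltas` are assumptions made for this example, and the values are copied from the table.

```python
# Rows copied from Tab. 2: (row label, component added, mAP50 in %, FPS).
rows = [
    ("a", "baseline", 47.92, 122),
    ("b", "+P2", 49.70, 91),
    ("c", "+SMAM", 51.71, 78),
    ("d", "+CSMAM", 52.07, 74),
]


def stepwise_deltas(table):
    """Return (component, mAP50 change, FPS change) for each consecutive row pair."""
    deltas = []
    for prev, cur in zip(table, table[1:]):
        deltas.append((cur[1], round(cur[2] - prev[2], 2), cur[3] - prev[3]))
    return deltas


deltas = stepwise_deltas(rows)
for component, d_map, d_fps in deltas:
    print(f"{component}: {d_map:+.2f} points mAP50, {d_fps:+d} FPS")

# Overall change from row a to row d.
total_map_gain = round(rows[-1][2] - rows[0][2], 2)
total_fps_change = rows[-1][3] - rows[0][3]
```

Read this way, the table shows every added component raising mAP50 while lowering FPS, with the largest speed cost at the P2 step.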
Model | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS/(frame·s⁻¹) |
---|---|---|---|---|---|
CBAM | 7.23 | 60.41 | 64.56 | 50.61 | 77.02 |
YOLOX-S | 9.01 | 212.23 | 92.99 | 47.61 | 69.98 |
PP-YOLO-S | 7.91 | 59.16 | 63.37 | 48.23 | 117.08 |
DETR | 41.00 | 123.65 | 86.01 | 46.16 | 27.90 |
YOLOv7-tiny | 6.02 | 48.58 | 47.38 | 45.23 | 131.21 |
YOLOv5s | 7.02 | 60.28 | 60.28 | 50.02 | 63.29 |
SMAM-YOLO | 7.37 | 62.62 | 74.42 | 52.07 | 74.07 |
Tab. 3 Comparison experimental results of different small object detection models