Journal of Computer Applications, 2023, Vol. 43, Issue (11): 3579-3586. DOI: 10.11772/j.issn.1001-9081.2022111660
Special Issue: Multimedia computing and computer simulation
					
Qiangqiang QIN, Junguo LIAO, Yixun ZHOU
Received: 2022-11-09
Revised: 2023-03-03
Accepted: 2023-03-03
Online: 2023-03-20
Published: 2023-11-10
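Corresponding author: Junguo LIAO
About author: QIN Qiangqiang, born in 1990, M. S. candidate. His research interests include artificial intelligence and object detection.
Author biography (from the Chinese-language page): QIN Qiangqiang (1997-), male, native of Wuhu, Anhui, M. S. candidate, CCF member. Main research interests: artificial intelligence and object detection.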
Qiangqiang QIN, Junguo LIAO, Yixun ZHOU. Small object detection algorithm based on split mixed attention[J]. Journal of Computer Applications, 2023, 43(11): 3579-3586.
QIN Qiangqiang, LIAO Junguo, ZHOU Yixun. Small object detection algorithm based on multi-branch mixed attention[J]. Journal of Computer Applications, 2023, 43(11): 3579-3586.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022111660
| Model | Input resolution | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS_1280/(frame·s⁻¹) |
|---|---|---|---|---|---|---|
| YOLOv5s | 640×640 | 7.02 | 56.81 | 15.8 | 32.27 | 122 |
| YOLOv5s | 960×960 | 7.02 | 57.02 | 33.9 | 41.34 | 122 |
| YOLOv5s | 1 280×1 280 | 7.02 | 57.31 | 57.3 | 47.92 | 122 |
| SMAM-YOLO | 640×640 | 7.37 | 60.59 | 19.9 | 38.16 | 74 |
| SMAM-YOLO | 960×960 | 7.37 | 61.44 | 42.6 | 45.82 | 74 |
| SMAM-YOLO | 1 280×1 280 | 7.37 | 62.62 | 74.4 | 52.07 | 74 |

Tab. 1 Experimental results at different input resolutions
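As a reading aid for Tab. 1, the following minimal sketch computes, for each input resolution, how many mAP50 points SMAM-YOLO gains over YOLOv5s and how much more computation it needs. The numbers are copied from the table; the dictionary layout and the `gains` helper are illustrative assumptions, not part of the paper.

```python
# Values copied from Tab. 1: resolution -> (GFLOPs, mAP50 in %).
# The data layout and helper name are illustrative, not from the paper.
yolov5s = {640: (15.8, 32.27), 960: (33.9, 41.34), 1280: (57.3, 47.92)}
smam_yolo = {640: (19.9, 38.16), 960: (42.6, 45.82), 1280: (74.4, 52.07)}


def gains(baseline, improved):
    """Return {resolution: (mAP50 gain in points, GFLOPs ratio)}."""
    out = {}
    for res, (base_flops, base_map) in baseline.items():
        new_flops, new_map = improved[res]
        out[res] = (round(new_map - base_map, 2),
                    round(new_flops / base_flops, 2))
    return out


result = gains(yolov5s, smam_yolo)
for res, (delta_map, flops_ratio) in result.items():
    print(f"{res}x{res}: +{delta_map:.2f} mAP50 points, "
          f"{flops_ratio:.2f}x GFLOPs")
```

On the table's figures the gain is 5.89 points at 640×640, 4.48 at 960×960 and 4.15 at 1 280×1 280, so the improvement is largest at the lowest resolution.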
| No. | Baseline | P2 | SMAM | CSMAM | Number of layers | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS/(frame·s⁻¹) |
|---|---|---|---|---|---|---|---|---|---|---|
| a | √ |  |  |  | 270 | 7.02 | 57.31 | 57.27 | 47.92 | 122 |
| b | √ | √ |  |  | 328 | 7.17 | 60.63 | 65.39 | 49.70 | 91 |
| c | √ | √ | √ |  | 496 | 7.62 | 64.43 | 76.16 | 51.71 | 78 |
| d | √ | √ | √ | √ | 587 | 7.37 | 62.62 | 74.40 | 52.07 | 74 |

Tab. 2 Ablation experimental results
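To make the step-by-step cost and benefit in Tab. 2 easier to read, the sketch below computes the change in mAP50, GFLOPs and FPS between consecutive configurations a to d. The values are copied from the table; the tuple layout and the `step_deltas` helper are hypothetical names introduced only for this illustration.

```python
# Rows of Tab. 2: (label, GFLOPs, mAP50 in %, FPS).
# Layout and helper name are illustrative, not from the paper.
rows = [
    ("a", 57.27, 47.92, 122),
    ("b", 65.39, 49.70, 91),
    ("c", 76.16, 51.71, 78),
    ("d", 74.40, 52.07, 74),
]


def step_deltas(rows):
    """Return (step, mAP50 change, GFLOPs change, FPS change) per step."""
    deltas = []
    for prev, cur in zip(rows, rows[1:]):
        deltas.append((
            f"{prev[0]}->{cur[0]}",
            round(cur[2] - prev[2], 2),  # mAP50 change in points
            round(cur[1] - prev[1], 2),  # GFLOPs change
            cur[3] - prev[3],            # FPS change
        ))
    return deltas


deltas = step_deltas(rows)
total_map_gain = round(rows[-1][2] - rows[0][2], 2)
for step, d_map, d_flops, d_fps in deltas:
    print(f"{step}: {d_map:+.2f} mAP50, {d_flops:+.2f} GFLOPs, {d_fps:+d} FPS")
print(f"a->d total: {total_map_gain:+.2f} mAP50")
```

Read this way, the largest single drop in FPS occurs at the first step (a to b), while the last step (c to d) slightly reduces GFLOPs and still raises mAP50.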
| Model | Parameters/10⁶ | Model size/MB | GFLOPs | mAP50/% | FPS/(frame·s⁻¹) |
|---|---|---|---|---|---|
| CBAM | 7.23 | 60.41 | 64.56 | 50.61 | 77.02 |
| YOLOX-S | 9.01 | 212.23 | 92.99 | 47.61 | 69.98 |
| PP-YOLO-S | 7.91 | 59.16 | 63.37 | 48.23 | 117.08 |
| DETR | 41.00 | 123.65 | 86.01 | 46.16 | 27.90 |
| YOLOv7-tiny | 6.02 | 48.58 | 47.38 | 45.23 | 131.21 |
| YOLOv5s | 7.02 | 60.28 | 60.28 | 50.02 | 63.29 |
| SMAM-YOLO | 7.37 | 62.62 | 74.42 | 52.07 | 74.07 |

Tab. 3 Comparison experimental results of different small object detection models
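One way to read the accuracy and speed columns of Tab. 3 together is to ask which models are not beaten on both mAP50 and FPS by some other entry. The sketch below performs that filter on the values copied from the table; the `non_dominated` helper and the choice of these two metrics are assumptions made for illustration, not an analysis taken from the paper.

```python
# (model, mAP50 in %, FPS) copied from Tab. 3.
# The helper and the two-metric comparison are illustrative assumptions.
models = [
    ("CBAM", 50.61, 77.02),
    ("YOLOX-S", 47.61, 69.98),
    ("PP-YOLO-S", 48.23, 117.08),
    ("DETR", 46.16, 27.90),
    ("YOLOv7-tiny", 45.23, 131.21),
    ("YOLOv5s", 50.02, 63.29),
    ("SMAM-YOLO", 52.07, 74.07),
]


def non_dominated(entries):
    """Keep models for which no other model is at least as good on both
    mAP50 and FPS and strictly better on at least one of them."""
    keep = []
    for name, m, fps in entries:
        dominated = any(
            m2 >= m and f2 >= fps and (m2 > m or f2 > fps)
            for _, m2, f2 in entries
        )
        if not dominated:
            keep.append(name)
    return keep


front = non_dominated(models)
print(front)
```

With the table's numbers, SMAM-YOLO stays in the filtered set because no other listed model reaches its mAP50, while faster models remain only by trading away accuracy.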