Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (3): 916-922. DOI: 10.11772/j.issn.1001-9081.2022010071
Special Issue: Multimedia computing and computer simulation
					
Yongxiang GU1,2, Xin LAN1,2, Boyi FU1,2, Xiaolin QIN1,2
Received: 2022-01-19
Revised: 2022-03-01
Accepted: 2022-03-07
Online: 2022-03-11
Published: 2023-03-10
Contact: Xiaolin QIN
About author: GU Yongxiang, born in 1997 in Suzhou, Jiangsu, M. S. candidate, CCF member. His research interests include deep learning and object detection.
Yongxiang GU, Xin LAN, Boyi FU, Xiaolin QIN. Object detection algorithm for remote sensing images based on geometric adaptation and global perception[J]. Journal of Computer Applications, 2023, 43(3): 916-922.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022010071
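| Dataset | Algorithm | Parameters/10^6 | Floating-point operations/GFLOPs | Category | P/% | R/% | AP50/% | mAP/% |
|---|---|---|---|---|---|---|---|---|
| UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Car | 91.3 | 93.1 | 92.9 | 56.5 |
| UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Airplane | 98.6 | 98.7 | 99.3 | 72.9 |
| UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Average | 95.0 | 95.9 | 96.1 | 64.7 |
| UCAS-AOD | YOLOv5s6 | 12.4 | 16.8 | Car | 89.7 | 92.4 | 91.8 | 55.1 |
| UCAS-AOD | YOLOv5s6 | 12.4 | 16.8 | Airplane | 98.1 | 98.4 | 99.3 | 72.9 |
| UCAS-AOD | YOLOv5s6 | 12.4 | 16.8 | Average | 93.9 | 95.1 | 95.5 | 64.0 |
| UCAS-AOD | Proposed algorithm | 13.7 | 17.0 | Car | 90.0 | 92.5 | 94.2 | 58.3 |
| UCAS-AOD | Proposed algorithm | 13.7 | 17.0 | Airplane | 97.6 | 98.3 | 99.3 | 73.4 |
| UCAS-AOD | Proposed algorithm | 13.7 | 17.0 | Average | 93.8 | 95.4 | 96.7 | 65.8 |
| RSOD | YOLOv3-SPP | 62.6 | 155.8 | Airplane | 89.9 | 90.1 | 94.1 | 61.8 |
| RSOD | YOLOv3-SPP | 62.6 | 155.8 | Oil tank | 95.1 | 98.1 | 98.4 | 77.4 |
| RSOD | YOLOv3-SPP | 62.6 | 155.8 | Overpass | 86.6 | 71.8 | 78.2 | 36.5 |
| RSOD | YOLOv3-SPP | 62.6 | 155.8 | Playground | 82.1 | 100.0 | 99.1 | 85.2 |
| RSOD | YOLOv3-SPP | 62.6 | 155.8 | Average | 88.4 | 90.0 | 92.4 | 65.2 |
| RSOD | YOLOv5s6 | 12.4 | 16.8 | Airplane | 97.3 | 82.9 | 93.9 | 64.3 |
| RSOD | YOLOv5s6 | 12.4 | 16.8 | Oil tank | 100.0 | 93.4 | 98.6 | 78.9 |
| RSOD | YOLOv5s6 | 12.4 | 16.8 | Overpass | 84.6 | 61.1 | 66.7 | 30.3 |
| RSOD | YOLOv5s6 | 12.4 | 16.8 | Playground | 99.5 | 100.0 | 99.5 | 87.0 |
| RSOD | YOLOv5s6 | 12.4 | 16.8 | Average | 95.3 | 84.4 | 89.7 | 65.1 |
| RSOD | Proposed algorithm | 13.7 | 17.0 | Airplane | 96.1 | 87.6 | 94.4 | 65.5 |
| RSOD | Proposed algorithm | 13.7 | 17.0 | Oil tank | 99.5 | 94.7 | 97.8 | 80.1 |
| RSOD | Proposed algorithm | 13.7 | 17.0 | Overpass | 80.0 | 66.7 | 66.9 | 33.4 |
| RSOD | Proposed algorithm | 13.7 | 17.0 | Playground | 86.5 | 100.0 | 99.5 | 87.3 |
| RSOD | Proposed algorithm | 13.7 | 17.0 | Average | 90.5 | 87.2 | 89.6 | 66.6 |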
Tab. 1 Comparison of detection results of different algorithms on the UCAS-AOD and RSOD datasets
| Transformer | CAM | DenseCAM | P/% | R/% | AP50/% | mAP/% | Parameters/10^6 | Floating-point operations/GFLOPs |
|---|---|---|---|---|---|---|---|---|
| — | — | — | 93.9 | 95.1 | 95.5 | 64.0 | 12.4 | 16.8 | 
| √ | — | — | 94.8 | 94.6 | 95.8 | 64.6 | 12.4 | 16.7 | 
| — | √ | — | 95.1 | 95.3 | 96.5 | 65.0 | 16.7 | 17.6 | 
| — | — | √ | 95.3 | 95.2 | 96.5 | 65.1 | 13.7 | 17.1 | 
| √ | — | √ | 93.8 | 95.4 | 96.7 | 65.8 | 13.7 | 17.0 | 
Tab. 2 Results of ablation study on UCAS-AOD dataset
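
The ablation rows in Tab. 2 are easier to compare as differences from the first row (no added module), which matches the YOLOv5s6 average on UCAS-AOD in Tab. 1. The following is only a small illustrative sketch: the values are copied from Tab. 2, while the names `ABLATION_ROWS` and `gains_over_baseline` are hypothetical helpers introduced here and are not part of the paper.

```python
# Rows of Tab. 2 (UCAS-AOD ablation), copied from the table.
# Tuple layout: (configuration, mAP/%, parameters/10^6, GFLOPs).
# The configuration labels are descriptive names chosen for this sketch.
ABLATION_ROWS = [
    ("baseline", 64.0, 12.4, 16.8),
    ("Transformer", 64.6, 12.4, 16.7),
    ("CAM", 65.0, 16.7, 17.6),
    ("DenseCAM", 65.1, 13.7, 17.1),
    ("Transformer + DenseCAM", 65.8, 13.7, 17.0),
]


def gains_over_baseline(rows):
    """Return {configuration: (mAP gain, extra parameters)} relative to the first row."""
    _, base_map, base_params, _ = rows[0]
    return {
        name: (round(m - base_map, 1), round(p - base_params, 1))
        for name, m, p, _ in rows[1:]
    }


if __name__ == "__main__":
    for name, (d_map, d_params) in gains_over_baseline(ABLATION_ROWS).items():
        print(f"{name}: mAP +{d_map} points, parameters +{d_params} x 10^6")
```

Read this way, the table shows DenseCAM reaching a slightly higher mAP than CAM with fewer added parameters, and the Transformer plus DenseCAM combination giving the largest mAP gain among the listed configurations.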