Journal of Computer Applications, 2023, Vol. 43, Issue (3): 916-922. DOI: 10.11772/j.issn.1001-9081.2022010071
Special Issue: Multimedia computing and computer simulation
Yongxiang GU, Xin LAN, Boyi FU, Xiaolin QIN
Received: 2022-01-19
Revised: 2022-03-01
Accepted: 2022-03-07
Online: 2022-03-11
Published: 2023-03-10
Contact: Xiaolin QIN
About author: GU Yongxiang, born in 1997, a native of Suzhou, Jiangsu, M. S. candidate, CCF member. His research interests include deep learning and object detection.
Yongxiang GU, Xin LAN, Boyi FU, Xiaolin QIN. Object detection algorithm for remote sensing images based on geometric adaptation and global perception[J]. Journal of Computer Applications, 2023, 43(3): 916-922.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022010071
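Dataset | Algorithm | Parameters/10^6 | Computation/GFLOPs | Class | P/% | R/% | AP50/% | mAP/% |
---|---|---|---|---|---|---|---|---|
UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Car | 91.3 | 93.1 | 92.9 | 56.5 |
UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Aircraft | 98.6 | 98.7 | 99.3 | 72.9 |
UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Average | 95.0 | 95.9 | 96.1 | 64.7 |
UCAS-AOD | YOLOv5s6 | 12.4 | 16.8 | Car | 89.7 | 92.4 | 91.8 | 55.1 |
UCAS-AOD | YOLOv5s6 | 12.4 | 16.8 | Aircraft | 98.1 | 98.4 | 99.3 | 72.9 |
UCAS-AOD | YOLOv5s6 | 12.4 | 16.8 | Average | 93.9 | 95.1 | 95.5 | 64.0 |
UCAS-AOD | Proposed algorithm | 13.7 | 17.0 | Car | 90.0 | 92.5 | 94.2 | 58.3 |
UCAS-AOD | Proposed algorithm | 13.7 | 17.0 | Aircraft | 97.6 | 98.3 | 99.3 | 73.4 |
UCAS-AOD | Proposed algorithm | 13.7 | 17.0 | Average | 93.8 | 95.4 | 96.7 | 65.8 |
RSOD | YOLOv3-SPP | 62.6 | 155.8 | Aircraft | 89.9 | 90.1 | 94.1 | 61.8 |
RSOD | YOLOv3-SPP | 62.6 | 155.8 | Oil tank | 95.1 | 98.1 | 98.4 | 77.4 |
RSOD | YOLOv3-SPP | 62.6 | 155.8 | Overpass | 86.6 | 71.8 | 78.2 | 36.5 |
RSOD | YOLOv3-SPP | 62.6 | 155.8 | Playground | 82.1 | 100.0 | 99.1 | 85.2 |
RSOD | YOLOv3-SPP | 62.6 | 155.8 | Average | 88.4 | 90.0 | 92.4 | 65.2 |
RSOD | YOLOv5s6 | 12.4 | 16.8 | Aircraft | 97.3 | 82.9 | 93.9 | 64.3 |
RSOD | YOLOv5s6 | 12.4 | 16.8 | Oil tank | 100.0 | 93.4 | 98.6 | 78.9 |
RSOD | YOLOv5s6 | 12.4 | 16.8 | Overpass | 84.6 | 61.1 | 66.7 | 30.3 |
RSOD | YOLOv5s6 | 12.4 | 16.8 | Playground | 99.5 | 100.0 | 99.5 | 87.0 |
RSOD | YOLOv5s6 | 12.4 | 16.8 | Average | 95.3 | 84.4 | 89.7 | 65.1 |
RSOD | Proposed algorithm | 13.7 | 17.0 | Aircraft | 96.1 | 87.6 | 94.4 | 65.5 |
RSOD | Proposed algorithm | 13.7 | 17.0 | Oil tank | 99.5 | 94.7 | 97.8 | 80.1 |
RSOD | Proposed algorithm | 13.7 | 17.0 | Overpass | 80.0 | 66.7 | 66.9 | 33.4 |
RSOD | Proposed algorithm | 13.7 | 17.0 | Playground | 86.5 | 100.0 | 99.5 | 87.3 |
RSOD | Proposed algorithm | 13.7 | 17.0 | Average | 90.5 | 87.2 | 89.6 | 66.6 |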
Tab. 1 Comparison of detection results of different algorithms on the UCAS-AOD and RSOD datasets
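To make the comparison in Tab. 1 easier to read, the following minimal sketch computes the mAP difference (in percentage points) between the proposed algorithm and each baseline, using only the "Average" rows of the table. The dictionary layout, the label `"Proposed"`, and the helper name `map_gain` are illustrative assumptions, not part of the paper.

```python
# "Average" rows copied from Tab. 1: (P, R, AP50, mAP), all in percent.
# "Proposed" is an illustrative label for the paper's own algorithm.
TAB1_AVG = {
    "UCAS-AOD": {
        "YOLOv3-SPP": (95.0, 95.9, 96.1, 64.7),
        "YOLOv5s6": (93.9, 95.1, 95.5, 64.0),
        "Proposed": (93.8, 95.4, 96.7, 65.8),
    },
    "RSOD": {
        "YOLOv3-SPP": (88.4, 90.0, 92.4, 65.2),
        "YOLOv5s6": (95.3, 84.4, 89.7, 65.1),
        "Proposed": (90.5, 87.2, 89.6, 66.6),
    },
}

MAP_INDEX = 3  # position of mAP inside each metric tuple


def map_gain(dataset: str, baseline: str, candidate: str = "Proposed") -> float:
    """Return candidate mAP minus baseline mAP, in percentage points."""
    rows = TAB1_AVG[dataset]
    return round(rows[candidate][MAP_INDEX] - rows[baseline][MAP_INDEX], 1)


if __name__ == "__main__":
    for dataset in TAB1_AVG:
        for baseline in ("YOLOv3-SPP", "YOLOv5s6"):
            print(f"{dataset}: vs {baseline}: {map_gain(dataset, baseline):+.1f} mAP")
```

On these table values, the proposed algorithm is ahead of YOLOv5s6 by 1.8 points of mAP on UCAS-AOD and 1.5 points on RSOD, while its average AP50 on RSOD (89.6) is slightly below that of both baselines.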
Transformer | CAM | DenseCAM | P/% | R/% | AP50/% | mAP/% | Parameters/10^6 | Computation/GFLOPs |
---|---|---|---|---|---|---|---|---|
— | — | — | 93.9 | 95.1 | 95.5 | 64.0 | 12.4 | 16.8 |
√ | — | — | 94.8 | 94.6 | 95.8 | 64.6 | 12.4 | 16.7 |
— | √ | — | 95.1 | 95.3 | 96.5 | 65.0 | 16.7 | 17.6 |
— | — | √ | 95.3 | 95.2 | 96.5 | 65.1 | 13.7 | 17.1 |
√ | — | √ | 93.8 | 95.4 | 96.7 | 65.8 | 13.7 | 17.0 |
Tab. 2 Results of ablation study on UCAS-AOD dataset
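The ablation rows in Tab. 2 can be read as accuracy gained against cost added relative to the first (baseline) row. The sketch below tabulates those differences from the published values; the `AblationRow` type and the `deltas` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Tuple


class AblationRow(NamedTuple):
    transformer: bool   # Transformer module enabled
    cam: bool           # CAM module enabled
    dense_cam: bool     # DenseCAM module enabled
    map_pct: float      # mAP/%
    params_m: float     # parameters / 10^6
    gflops: float       # computation / GFLOPs


# Values copied from Tab. 2 (UCAS-AOD); the first row is the baseline.
ROWS = [
    AblationRow(False, False, False, 64.0, 12.4, 16.8),
    AblationRow(True, False, False, 64.6, 12.4, 16.7),
    AblationRow(False, True, False, 65.0, 16.7, 17.6),
    AblationRow(False, False, True, 65.1, 13.7, 17.1),
    AblationRow(True, False, True, 65.8, 13.7, 17.0),
]


def deltas(row: AblationRow, base: AblationRow = ROWS[0]) -> Tuple[float, float, float]:
    """Return (mAP gain, extra parameters in 10^6, extra GFLOPs) versus base."""
    return (
        round(row.map_pct - base.map_pct, 1),
        round(row.params_m - base.params_m, 1),
        round(row.gflops - base.gflops, 1),
    )


if __name__ == "__main__":
    for row in ROWS[1:]:
        d_map, d_params, d_gflops = deltas(row)
        print(f"{row[:3]}: mAP {d_map:+.1f}, params {d_params:+.1f}M, GFLOPs {d_gflops:+.1f}")
```

Read this way, DenseCAM alone gives about the same mAP gain as CAM (1.1 versus 1.0 points) with 1.3 rather than 4.3 million extra parameters, and combining it with the Transformer yields the largest gain in the table (1.8 points).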