Journal of Computer Applications, 2023, Vol. 43, Issue (3): 916-922. DOI: 10.11772/j.issn.1001-9081.2022010071
• Multimedia computing and computer simulation •
Yongxiang GU, Xin LAN, Boyi FU, Xiaolin QIN
Received: 2022-01-19
Revised: 2022-03-01
Accepted: 2022-03-07
Online: 2022-03-11
Published: 2023-03-10
Contact: Xiaolin QIN
About author: GU Yongxiang, born in 1997 in Suzhou, Jiangsu, M. S. candidate, CCF member. His research interests include deep learning and object detection.
Yongxiang GU, Xin LAN, Boyi FU, Xiaolin QIN. Object detection algorithm for remote sensing images based on geometric adaptation and global perception[J]. Journal of Computer Applications, 2023, 43(3): 916-922.
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022010071
Dataset | Algorithm | Parameters/10^6 | Floating-point operations/GFLOPs | Category | P/% | R/% | AP50/% | mAP/% |
---|---|---|---|---|---|---|---|---|
UCAS-AOD | YOLOv3-SPP | 62.6 | 155.8 | Car | 91.3 | 93.1 | 92.9 | 56.5 |
 | | | | Airplane | 98.6 | 98.7 | 99.3 | 72.9 |
 | | | | Average | 95.0 | 95.9 | 96.1 | 64.7 |
 | YOLOv5s6 | 12.4 | 16.8 | Car | 89.7 | 92.4 | 91.8 | 55.1 |
 | | | | Airplane | 98.1 | 98.4 | 99.3 | 72.9 |
 | | | | Average | 93.9 | 95.1 | 95.5 | 64.0 |
 | Proposed algorithm | 13.7 | 17.0 | Car | 90.0 | 92.5 | 94.2 | 58.3 |
 | | | | Airplane | 97.6 | 98.3 | 99.3 | 73.4 |
 | | | | Average | 93.8 | 95.4 | 96.7 | 65.8 |
RSOD | YOLOv3-SPP | 62.6 | 155.8 | Airplane | 89.9 | 90.1 | 94.1 | 61.8 |
 | | | | Oil tank | 95.1 | 98.1 | 98.4 | 77.4 |
 | | | | Overpass | 86.6 | 71.8 | 78.2 | 36.5 |
 | | | | Playground | 82.1 | 100.0 | 99.1 | 85.2 |
 | | | | Average | 88.4 | 90.0 | 92.4 | 65.2 |
 | YOLOv5s6 | 12.4 | 16.8 | Airplane | 97.3 | 82.9 | 93.9 | 64.3 |
 | | | | Oil tank | 100.0 | 93.4 | 98.6 | 78.9 |
 | | | | Overpass | 84.6 | 61.1 | 66.7 | 30.3 |
 | | | | Playground | 99.5 | 100.0 | 99.5 | 87.0 |
 | | | | Average | 95.3 | 84.4 | 89.7 | 65.1 |
 | Proposed algorithm | 13.7 | 17.0 | Airplane | 96.1 | 87.6 | 94.4 | 65.5 |
 | | | | Oil tank | 99.5 | 94.7 | 97.8 | 80.1 |
 | | | | Overpass | 80.0 | 66.7 | 66.9 | 33.4 |
 | | | | Playground | 86.5 | 100.0 | 99.5 | 87.3 |
 | | | | Average | 90.5 | 87.2 | 89.6 | 66.6 |
Tab. 1 Comparison of detection results of different algorithms on the UCAS-AOD and RSOD datasets
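The "average" rows in Tab. 1 appear consistent with an unweighted mean over the per-category values. The following is a minimal sketch that checks this reading for the mAP column only; the dictionary layout, the function names, and the 0.1-point tolerance (to absorb rounding in the published figures) are assumptions made for illustration, not part of the paper.

```python
from statistics import mean

# Per-category mAP/% values copied from Tab. 1 (order: categories as listed).
PER_CATEGORY_MAP = {
    ("UCAS-AOD", "YOLOv3-SPP"): [56.5, 72.9],
    ("UCAS-AOD", "YOLOv5s6"): [55.1, 72.9],
    ("UCAS-AOD", "Proposed"): [58.3, 73.4],
    ("RSOD", "YOLOv3-SPP"): [61.8, 77.4, 36.5, 85.2],
    ("RSOD", "YOLOv5s6"): [64.3, 78.9, 30.3, 87.0],
    ("RSOD", "Proposed"): [65.5, 80.1, 33.4, 87.3],
}

# "Average" mAP/% rows as reported in Tab. 1.
REPORTED_AVERAGE_MAP = {
    ("UCAS-AOD", "YOLOv3-SPP"): 64.7,
    ("UCAS-AOD", "YOLOv5s6"): 64.0,
    ("UCAS-AOD", "Proposed"): 65.8,
    ("RSOD", "YOLOv3-SPP"): 65.2,
    ("RSOD", "YOLOv5s6"): 65.1,
    ("RSOD", "Proposed"): 66.6,
}


def macro_average(values):
    """Unweighted mean over categories (assumed averaging scheme)."""
    return mean(values)


def check_averages(tolerance=0.1):
    """Return, per (dataset, algorithm), whether the recomputed mean
    agrees with the reported average within the given tolerance."""
    return {
        key: abs(macro_average(values) - REPORTED_AVERAGE_MAP[key]) <= tolerance
        for key, values in PER_CATEGORY_MAP.items()
    }


if __name__ == "__main__":
    for key, ok in check_averages().items():
        print(key, "consistent" if ok else "mismatch")
```

Because the inputs are already rounded to one decimal place, small deviations from the reported averages are expected; the tolerance is there for that reason rather than to hide real discrepancies.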
Transformer | CAM | DenseCAM | P/% | R/% | AP50/% | mAP/% | Parameters/10^6 | Floating-point operations/GFLOPs |
---|---|---|---|---|---|---|---|---|
— | — | — | 93.9 | 95.1 | 95.5 | 64.0 | 12.4 | 16.8 |
√ | — | — | 94.8 | 94.6 | 95.8 | 64.6 | 12.4 | 16.7 |
— | √ | — | 95.1 | 95.3 | 96.5 | 65.0 | 16.7 | 17.6 |
— | — | √ | 95.3 | 95.2 | 96.5 | 65.1 | 13.7 | 17.1 |
√ | — | √ | 93.8 | 95.4 | 96.7 | 65.8 | 13.7 | 17.0 |
Tab. 2 Results of ablation study on UCAS-AOD dataset
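To make the accuracy and cost trade-off in Tab. 2 easier to read, the sketch below computes each configuration's mAP gain (in percentage points) and relative parameter increase over the first row, which has no module enabled and matches the YOLOv5s6 average in Tab. 1. The `AblationRow` class, the row labels, and the `deltas` helper are hypothetical names introduced here for illustration; only the numbers come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AblationRow:
    name: str        # label for the enabled modules (hypothetical naming)
    map_pct: float   # mAP/% from Tab. 2
    params_m: float  # parameters / 10^6 from Tab. 2
    gflops: float    # floating-point operations / GFLOPs from Tab. 2


# Rows copied from Tab. 2; the first row is treated as the baseline.
ROWS = [
    AblationRow("baseline", 64.0, 12.4, 16.8),
    AblationRow("Transformer", 64.6, 12.4, 16.7),
    AblationRow("CAM", 65.0, 16.7, 17.6),
    AblationRow("DenseCAM", 65.1, 13.7, 17.1),
    AblationRow("Transformer+DenseCAM", 65.8, 13.7, 17.0),
]


def deltas(rows):
    """Map each non-baseline row to (mAP gain in points,
    parameter increase in percent), both rounded to one decimal."""
    base = rows[0]
    return {
        row.name: (
            round(row.map_pct - base.map_pct, 1),
            round(100 * (row.params_m - base.params_m) / base.params_m, 1),
        )
        for row in rows[1:]
    }


if __name__ == "__main__":
    for name, (map_gain, param_increase) in deltas(ROWS).items():
        print(f"{name}: +{map_gain} mAP points, +{param_increase}% parameters")
```

Read this way, the table suggests that DenseCAM reaches a slightly higher mAP than CAM with far fewer added parameters, and that combining it with the Transformer gives the largest mAP gain among the listed configurations.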