Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (3): 916-922. DOI: 10.11772/j.issn.1001-9081.2022010071

Special Issue: Multimedia computing and computer simulation

• Multimedia computing and computer simulation •

Object detection algorithm for remote sensing images based on geometric adaptation and global perception

Yongxiang GU1,2, Xin LAN1,2, Boyi FU1,2, Xiaolin QIN1,2

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610041, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received:2022-01-19 Revised:2022-03-01 Accepted:2022-03-07 Online:2022-03-11 Published:2023-03-10
  • Contact: Xiaolin QIN
  • About author: GU Yongxiang, born in 1997, from Suzhou, Jiangsu, M. S. candidate, CCF member. His research interests include deep learning and object detection.
    LAN Xin, born in 1998, M. S. candidate. Her research interests include deep learning and object detection.
    FU Boyi, born in 1998, M. S. candidate. Her research interests include deep learning and object detection.
  • Supported by:
    National Academy of Science Alliance Collaborative Program (Chengdu Branch of Chinese Academy of Sciences - Chongqing Academy of Science and Technology); Science and Technology Service Network Initiative (STS) Key Regional Program (Type A) of Chinese Academy of Sciences (KFJ-STS-QYZD-2021-21-001); Sichuan Science and Technology Program (2019ZDZX0006); "Western Young Scholars" Project of Chinese Academy of Sciences (201899); Talent Special Project of Organization Department of Sichuan Provincial Party Committee


To address the small object sizes, arbitrary object orientations, and complex backgrounds typical of remote sensing images, an object detection algorithm incorporating geometric adaptation and global perception was proposed on the basis of the YOLOv5 (You Only Look Once version 5) algorithm. Firstly, deformable convolutions and adaptive spatial attention modules were stacked alternately in series through dense connections, yielding a Dense Context-Aware Module (DenseCAM) that models local geometric features while making full use of semantic and location information from different levels. Secondly, a Transformer was introduced at the end of the backbone network, which enhanced the global perception ability of the model at low cost and modeled the relationships between objects and scene content. On the UCAS-AOD and RSOD datasets, the proposed algorithm increased the mean Average Precision (mAP) by 1.8 and 1.5 percentage points, respectively, compared with the YOLOv5s6 algorithm. Experimental results show that the proposed algorithm can effectively improve the precision of object detection in remote sensing images.
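
The global perception step can be illustrated with a minimal sketch of single-head scaled dot-product self-attention, the core operation of a Transformer layer, applied to a flattened feature map. This is not the authors' implementation: the function names, the tiny 2x2 feature map, and the use of identity query/key/value projections (no learned weights, no positional encoding, no feed-forward sublayer) are all assumptions made for illustration. It only shows how every output position becomes a weighted mix of all spatial positions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_feature_map(fmap):
    """Turn fmap[c][y][x] into H*W tokens, each a vector of C channels."""
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    return [
        [fmap[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves
    (identity projections); a real layer would learn these projections.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this position to every position in the map.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Output is a weighted sum over ALL positions (global context).
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, tokens)) for c in range(dim)]
        )
    return outputs, weights


# Hypothetical 2-channel, 2x2 feature map used only for this example.
feature_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
tokens = flatten_feature_map(feature_map)
outputs, weights = self_attention(tokens)
```

Because every attention weight is strictly positive, each position draws on the whole feature map, which a convolution with a small kernel cannot do in a single layer. Attention cost grows quadratically with the number of tokens, so applying it where the feature map has the lowest spatial resolution keeps the token count small; this is one plausible reading of the "low cost" noted in the abstract.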

Key words: remote sensing image, object detection, Transformer, deformable convolution, spatial attention, YOLOv5



