Journal of Computer Applications, 2023, Vol. 43, Issue (3): 916-922. DOI: 10.11772/j.issn.1001-9081.2022010071

• Multimedia computing and computer simulation •

Object detection algorithm for remote sensing images based on geometric adaptation and global perception

Yongxiang GU1,2, Xin LAN1,2, Boyi FU1,2, Xiaolin QIN1,2

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610041, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2022-01-19 Revised: 2022-03-01 Accepted: 2022-03-07 Online: 2022-03-11 Published: 2023-03-10
  • Contact: Xiaolin QIN
  • About author: GU Yongxiang, born in 1997, M. S. candidate. His research interests include deep learning and object detection.
    LAN Xin, born in 1998, M. S. candidate. Her research interests include deep learning and object detection.
    FU Boyi, born in 1998, M. S. candidate. Her research interests include deep learning and object detection.
  • Supported by:
    National Academy of Science Alliance Collaborative Program (Chengdu Branch of Chinese Academy of Sciences - Chongqing Academy of Science and Technology); Science and Technology Service Network Initiative (STS) Key Regional Program (Type A) of Chinese Academy of Sciences (KFJ-STS-QYZD-2021-21-001); Sichuan Science and Technology Program (2019ZDZX0006); "Western Young Scholars" Project of Chinese Academy of Sciences (201899); Talent Special Project of Organization Department of Sichuan Provincial Party Committee

Remote sensing image object detection algorithm based on geometric adaptation and global perception

GU Yongxiang (顾勇翔)1,2, LAN Xin (蓝鑫)1,2, FU Boyi (伏博毅)1,2, QIN Xiaolin (秦小林)1,2

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610041
  2. University of Chinese Academy of Sciences, Beijing 100049
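  • Corresponding author: QIN Xiaolin
  • About the authors: GU Yongxiang (born 1997), male, from Suzhou, Jiangsu, M. S. candidate, CCF member. Main research interests: deep learning, object detection.
    LAN Xin (born 1998), female, from Longyan, Fujian, M. S. candidate, CCF member. Main research interests: deep learning, object detection.
    FU Boyi (born 1998), female, from Yueyang, Hunan, M. S. candidate, CCF member. Main research interests: deep learning, object detection.
    QIN Xiaolin (born 1980), male, from Chongqing, research fellow, Ph. D., CCF member. Main research interests: automated reasoning, artificial intelligence.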
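  • Funding:
    National Academy of Science Alliance Collaborative Program (Chengdu Branch of Chinese Academy of Sciences - Chongqing Academy of Science and Technology); Chinese Academy of Sciences STS Key Regional Program (Type A) (KFJ-STS-QYZD-2021-21-001); Sichuan Science and Technology Program (2019ZDZX0006); "Western Young Scholars" Project of Chinese Academy of Sciences (201899); Talent Special Project of the Organization Department of the Sichuan Provincial Party Committee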

Abstract:

To address the small object sizes, arbitrary object orientations and complex backgrounds found in remote sensing images, an object detection algorithm based on geometric adaptation and global perception was proposed on the basis of the YOLOv5 (You Only Look Once version 5) algorithm. Firstly, deformable convolutions and adaptive spatial attention modules were stacked alternately in series through dense connections, yielding a Dense Context-Aware Module (DenseCAM) that models local geometric features while making full use of semantic and location information from different levels. Secondly, a Transformer was introduced at the end of the backbone network, which enhanced the global perception ability of the model at low cost and modeled the relationships between objects and scene content. On the UCAS-AOD and RSOD datasets, the proposed algorithm increased the mean Average Precision (mAP) by 1.8 and 1.5 percentage points, respectively, compared with the YOLOv5s6 algorithm. Experimental results show that the proposed algorithm can effectively improve the precision of object detection in remote sensing images.
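
The alternating, densely connected stacking described above can be pictured with a small bookkeeping sketch. It is only an illustration, not the authors' implementation: the function name `dense_alternating_plan`, the number of stages, and the `growth` parameter (new channels emitted per stage) are assumptions borrowed from generic dense-connection designs, since the abstract does not state these details.

```python
from typing import List, Tuple

Stage = Tuple[str, int, int]  # (stage kind, input channels, output channels)


def dense_alternating_plan(in_channels: int, growth: int,
                           num_stages: int) -> Tuple[List[Stage], int]:
    """Plan a densely connected stack that alternates two stage kinds.

    Hypothetical helper: every stage is assumed to read the concatenation of
    the block input and all earlier stage outputs, and to emit `growth`
    channels. Real DenseCAM layer sizes may differ.
    """
    plan: List[Stage] = []
    visible = in_channels  # channels available to the next stage
    for index in range(num_stages):
        # Even stages stand for deformable convolutions, odd ones for
        # adaptive spatial attention, mirroring the alternating order.
        kind = "deformable_conv" if index % 2 == 0 else "spatial_attention"
        plan.append((kind, visible, growth))
        visible += growth  # dense connection: concatenate the new output
    return plan, visible


if __name__ == "__main__":
    stages, total = dense_alternating_plan(in_channels=64, growth=16, num_stages=4)
    for kind, c_in, c_out in stages:
        print(f"{kind:>18}: {c_in} -> {c_out} channels")
    print("channels after final concatenation:", total)
```

The point of the sketch is that each later stage sees every earlier output, which is how a dense stack keeps both shallow location cues and deeper semantic cues available.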

Key words: remote sensing image, object detection, Transformer, deformable convolution, spatial attention, YOLOv5

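Abstract (translated from the Chinese):

To address problems of remote sensing images such as small object size, arbitrary object orientation and complex background, a remote sensing image object detection algorithm based on geometric adaptation and global perception is proposed on the basis of the YOLOv5 algorithm. First, deformable convolutions and adaptive spatial attention modules are stacked alternately in series through dense connections, so that a Dense Context-Aware Module (DenseCAM) capable of modeling local geometric features is built while making full use of semantic and location information from different levels. Second, a Transformer is introduced at the end of the backbone network to strengthen the global perception ability of the model at low cost and to model the relationships between objects and scene content. On the UCAS-AOD and RSOD datasets, compared with the YOLOv5s6 algorithm, the proposed algorithm improves the mean Average Precision (mAP) by 1.8 and 1.5 percentage points, respectively. Experimental results show that the proposed algorithm can effectively improve the precision of object detection in remote sensing images.

Key words: remote sensing image, object detection, Transformer, deformable convolution, spatial attention, YOLOv5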
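
For readers unfamiliar with the reported metric, the following is a minimal sketch of how average precision (AP) and its mean over classes (mAP) are commonly computed. It assumes all-point interpolation and that each detection has already been marked as a true or false positive by an upstream IoU match; the abstract does not state which IoU threshold or interpolation the authors used, and the function names and sample detections here are made up for illustration.

```python
from typing import List, Tuple


def average_precision(scored_hits: List[Tuple[float, bool]], num_gt: int) -> float:
    """Area under the precision-recall curve for one class.

    scored_hits: (confidence, is_true_positive) for each detection.
    num_gt: number of ground-truth objects of this class.
    """
    if num_gt == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, hit) in enumerate(ranked, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_gt)
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """Unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    # Illustrative detections only, not data from the paper.
    ap_a = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_gt=2)
    ap_b = average_precision([(0.9, True)], num_gt=1)
    print(ap_a, ap_b, mean_average_precision([ap_a, ap_b]))
```

Note that a gain of 1.8 percentage points is an absolute difference between two mAP values expressed on a 0 to 100 scale, not a relative improvement of 1.8%.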

CLC Number: