Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (3): 872-877. DOI: 10.11772/j.issn.1001-9081.2019071314

• Virtual Reality and Multimedia Computing •

Remote sensing image scene classification based on scale-attention network

BIAN Xiaoyong1,2,3, FEI Xiongjun1,2,3, MU Nan1,2,3

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
  2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
  3. Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China
  • Received: 2019-07-29 Revised: 2019-09-09 Online: 2020-03-10 Published: 2019-09-19
  • Corresponding author: BIAN Xiaoyong
  • About the authors: BIAN Xiaoyong (born 1976), male, from Ji'an, Jiangxi, associate professor, Ph.D.; research interests: remote sensing image scene classification, feature learning. FEI Xiongjun (born 1996), male, from Huanggang, Hubei, master's student; research interests: computer vision, deep learning. MU Nan (born 1991), male, from Nanyang, Henan, assistant professor, Ph.D.; research interests: computer vision, saliency detection.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61572381, 61501337) and the Natural Science Foundation of Hubei Province (2018CFB575).

Abstract: Convolutional Neural Networks (CNNs) treat potential object information and background information in an input image equally, yet remote sensing scene images contain many small objects and complex backgrounds. To address this problem, a scale-attention network based on an attention mechanism and multi-scale feature transformation is proposed. First, a fast and effective attention module is developed, which generates an attention map based on optimal feature selection. Then, on the basis of the ResNet50 architecture, the attention map is embedded, a multi-scale feature fusion layer is added, and the fully connected layer is redesigned, forming the scale-attention network. Next, a pre-trained model is used to initialize the scale-attention network, and the network is fine-tuned on the training set. Finally, the fine-tuned scale-attention network is used to predict the classes of the test set. On the AID scene dataset, the proposed method achieves a classification accuracy of 95.72%, which is 2.62 percentage points higher than that of ArcNet; on the NWPU-RESISC scene dataset, it achieves 92.25%, which is 0.95 percentage points higher than that of IORN (Improved Oriented Response Network). The experimental results show that the proposed method effectively improves the classification accuracy of remote sensing image scenes.

Key words: remote sensing image scene classification, deep learning, multi-scale feature transformation, attention mechanism, residual network, fine-tuning
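The gains quoted in the abstract are absolute differences in percentage points, not relative improvements. The short Python sketch below makes that arithmetic explicit. The baseline accuracies it computes are only implied by the abstract's figures (accuracy minus gain); they are not quoted from the paper's results. The helper names `implied_baseline` and `relative_gain` are hypothetical, introduced only for illustration.

```python
# Accuracies (%) and gains (percentage points) exactly as quoted in the abstract.
reported = {
    "AID": {"accuracy": 95.72, "gain": 2.62, "baseline": "ArcNet"},
    "NWPU-RESISC": {"accuracy": 92.25, "gain": 0.95, "baseline": "IORN"},
}


def implied_baseline(accuracy, gain_points):
    """Baseline accuracy implied by an absolute gain in percentage points."""
    return round(accuracy - gain_points, 2)


def relative_gain(accuracy, gain_points):
    """The same gain expressed relative to the implied baseline, in percent."""
    return round(100.0 * gain_points / (accuracy - gain_points), 2)


# Implied baseline per dataset (an inference from the abstract, not a reported value).
implied = {
    name: implied_baseline(r["accuracy"], r["gain"])
    for name, r in reported.items()
}

for name, r in reported.items():
    print(
        f"{name}: {r['accuracy']}% vs. implied {r['baseline']} {implied[name]}% "
        f"(+{r['gain']} points, {relative_gain(r['accuracy'], r['gain'])}% relative)"
    )
```

Reading the numbers this way separates an absolute gain in percentage points from the somewhat larger relative gain over the implied baseline.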

CLC Number: