Remote sensing image scene classification based on scale-attention network

Journal of Computer Applications, 2020, Vol. 40, Issue (3): 872-877. DOI: 10.11772/j.issn.1001-9081.2019071314

• Virtual Reality and Multimedia Computing •

BIAN Xiaoyong^1,2,3, FEI Xiongjun^1,2,3, MU Nan^1,2,3

1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan Hubei 430065, China;
3. Hubei Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China

Received: 2019-07-29 Revised: 2019-09-09 Online: 2019-09-19 Published: 2020-03-10
Corresponding author: BIAN Xiaoyong
About the authors: BIAN Xiaoyong (born 1976), male, from Ji'an, Jiangxi; associate professor, Ph.D.; research interests: remote sensing image scene classification, feature learning. FEI Xiongjun (born 1996), male, from Huanggang, Hubei; M.S. candidate; research interests: computer vision, deep learning. MU Nan (born 1991), male, from Nanyang, Henan; assistant professor, Ph.D.; research interests: computer vision, saliency detection.
Supported by:
This work is partially supported by the National Natural Science Foundation of China (61572381, 61501337) and the Natural Science Fund of Hubei Province (2018CFB575).

Abstract

Abstract: A Convolutional Neural Network (CNN) treats potential object information and background information in the input image equally, yet remote sensing scene images contain many small objects and complex backgrounds. To address this problem, a scale-attention network based on an attention mechanism and multi-scale feature transformation was proposed. First, a fast and effective attention module was developed to generate an attention map based on optimal feature selection. Second, the scale-attention network was built on the ResNet50 architecture by embedding the attention map, adding a multi-scale feature fusion layer, and redesigning the fully connected layer. Third, a pre-trained model was used to initialize the scale-attention network, which was then fine-tuned on the training set. Finally, the fine-tuned scale-attention network was used to predict the classes of the test set. The proposed method achieves a classification accuracy of 95.72% on the AID scene dataset, 2.62 percentage points higher than ArcNet, and 92.25% on the NWPU-RESISC scene dataset, 0.95 percentage points higher than IORN (Improved Oriented Response Network). The experimental results demonstrate that the proposed method effectively improves the classification accuracy of remote sensing image scenes.

Key words: remote sensing image scene classification, deep learning, multi-scale feature transformation, attention mechanism, residual network, fine-tuning
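
The abstract names two building blocks, an attention map that reweights features and a multi-scale feature fusion layer, without giving their internals. The following is only a minimal sketch of the general idea in plain Python, not the authors' implementation. Everything specific in it is an assumption made for illustration: the attention map is taken as a sigmoid of the channel-wise maximum, the multi-scale fusion is average pooling over 1x1 and 2x2 grids followed by concatenation, and all function names are hypothetical.

```python
import math


def attention_map(features):
    """Build an H x W spatial attention map from a C x H x W feature map.

    Assumption: the strongest channel response at each location is squashed
    by a sigmoid. The abstract does not specify the actual selection rule.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            1.0 / (1.0 + math.exp(-max(features[c][i][j] for c in range(channels))))
            for j in range(width)
        ]
        for i in range(height)
    ]


def apply_attention(features, attn):
    """Reweight every channel by the shared spatial attention map."""
    return [
        [[v * a for v, a in zip(row, attn_row)] for row, attn_row in zip(channel, attn)]
        for channel in features
    ]


def grid_average_pool(channel, grid):
    """Average-pool one H x W channel into grid*grid values (H, W divisible by grid)."""
    block_h, block_w = len(channel) // grid, len(channel[0]) // grid
    pooled = []
    for gi in range(grid):
        for gj in range(grid):
            block = [
                channel[i][j]
                for i in range(gi * block_h, (gi + 1) * block_h)
                for j in range(gj * block_w, (gj + 1) * block_w)
            ]
            pooled.append(sum(block) / len(block))
    return pooled


def multi_scale_descriptor(features, grids=(1, 2)):
    """Concatenate pooled features from several grid sizes into one vector."""
    descriptor = []
    for grid in grids:
        for channel in features:
            descriptor.extend(grid_average_pool(channel, grid))
    return descriptor


# Toy 2-channel, 4 x 4 feature map: channel 0 has a small central "object",
# channel 1 is a flat background response.
features = [
    [[0.0, 0.0, 0.0, 0.0],
     [0.0, 4.0, 4.0, 0.0],
     [0.0, 4.0, 4.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]],
    [[1.0, 1.0, 1.0, 1.0],
     [1.0, 1.0, 1.0, 1.0],
     [1.0, 1.0, 1.0, 1.0],
     [1.0, 1.0, 1.0, 1.0]],
]

attn = attention_map(features)
weighted = apply_attention(features, attn)
descriptor = multi_scale_descriptor(weighted)
```

In this toy case the attention weights at the central object locations are larger than those at the border, so the object responses are attenuated less than the background before pooling. In a real network the descriptor would feed the redesigned fully connected layer, and the backbone would be initialized from a pre-trained model and fine-tuned as the abstract describes.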

CLC Number: TP391.4

BIAN Xiaoyong, FEI Xiongjun, MU Nan. Remote sensing image scene classification based on scale-attention network[J]. Journal of Computer Applications, 2020, 40(3): 872-877.

