Joint 1-2-order pooling network learning for remote sensing scene classification

doi:10.11772/j.issn.1001-9081.2021040647

Journal of Computer Applications, 2022, Vol. 42, Issue 6: 1972-1978. DOI: 10.11772/j.issn.1001-9081.2021040647

• Multimedia computing and computer simulation •

Xiaoyong BIAN¹,²,³, Xiongjun FEI¹, Chunfang CHEN¹, Dongdong KAN¹, Sheng DING¹,²,³

1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan Hubei 430065, China
2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan Hubei 430065, China
3. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Wuhan University of Science and Technology), Wuhan Hubei 430065, China

Received: 2021-04-23 Revised: 2021-07-30 Accepted: 2021-08-05 Online: 2022-06-22 Published: 2022-06-10
Contact: Xiaoyong BIAN
About the authors: FEI Xiongjun, born in 1992, M.S. candidate. His research interests include high-order pooling.
CHEN Chunfang, born in 1992, M.S. candidate. Her research interests include deep multi-instance learning.
KAN Dongdong, born in 1998, M.S. candidate. His research interests include high-order pooling.
DING Sheng, born in 1975, Ph.D., associate professor. His research interests include object detection and deep learning.
Supported by:
National Natural Science Foundation of China (61972299); Graduate Innovation Foundation of Wuhan University of Science and Technology (JCX201927)

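About the authors (Chinese-language record): BIAN Xiaoyong (born 1976), male, from Ji'an, Jiangxi, Ph.D., associate professor. His research interests include machine learning and remote sensing scene classification.
FEI Xiongjun (born 1996), male, from Huanggang, Hubei, M.S. candidate. His research interests include high-order pooling.
CHEN Chunfang (born 1996), female, from Jingzhou, Hubei, M.S. candidate. Her research interests include deep multi-instance learning.
KAN Dongdong (born 1998), male, from Huangshi, Hubei, M.S. candidate. His research interests include high-order pooling.
DING Sheng (born 1975), male, from Wuhan, Hubei, Ph.D., associate professor. His research interests include object detection and deep learning.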

Abstract

At present, most pooling methods extract aggregated feature information from either a first-order pooling layer or a second-order pooling layer, ignoring the comprehensive representation capability that multiple pooling strategies offer for scenes, which degrades scene recognition performance. To address this problem, a joint model with first- and second-order pooling network learning for remote sensing scene classification was proposed. Firstly, the convolutional layers of the residual network ResNet-50 were used to extract the initial features of the input images. Then, a second-order pooling approach based on the similarity of feature vectors was proposed, in which weight coefficients derived from the similarity between feature vectors modulate the information distribution of the feature values, and effective second-order feature information was calculated. Meanwhile, an approximate method for solving the square root of the covariance matrix was introduced to obtain a second-order feature representation with higher-level semantic information. Finally, the entire network was trained with a combined loss function composed of cross-entropy and class-distance weighting, yielding a discriminative classification model. The proposed method was tested on the AID (50% training proportion), NWPU-RESISC45 (20% training proportion), CIFAR-10 and CIFAR-100 datasets and achieved classification accuracies of 96.32%, 93.38%, 96.51% and 83.30% respectively, which are 1.09, 0.55, 1.05 and 1.57 percentage points higher, respectively, than those of iterative matrix SQuare RooT normalization of COVariance pooling (iSQRT-COV). Experimental results show that the proposed method effectively improves the performance of remote sensing scene classification.

Key words: remote sensing scene classification, deep learning, first-order pooling, second-order pooling, square root of covariance matrix
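
The pooling pipeline summarised in the abstract can be illustrated with a minimal NumPy sketch. It assumes plain covariance pooling and the coupled Newton-Schulz iteration, which is the square-root approximation used by iSQRT-COV; the approximation adopted in the paper may differ, and the similarity-based weighting it proposes is not reproduced here. Concatenating the first- and second-order descriptors is also an assumption, since the abstract does not say how the two branches are fused. All function names are hypothetical.

```python
import numpy as np


def global_average_pool(features):
    """First-order pooling: mean over spatial positions of a (C, H, W) map."""
    return features.reshape(features.shape[0], -1).mean(axis=1)


def covariance_pool(features):
    """Second-order pooling: C x C covariance of the channel responses."""
    channels = features.shape[0]
    x = features.reshape(channels, -1)         # (C, N) with N = H * W
    x = x - x.mean(axis=1, keepdims=True)      # centre every channel
    return x @ x.T / x.shape[1]


def newton_schulz_sqrt(cov, num_iters=5):
    """Approximate square root of a symmetric positive semi-definite matrix.

    Assumes a non-zero trace. The matrix is divided by its trace first so
    that the iteration converges, and the result is rescaled afterwards.
    """
    eye = np.eye(cov.shape[0])
    trace = np.trace(cov)
    y = cov / trace
    z = eye.copy()
    for _ in range(num_iters):
        t = 0.5 * (3.0 * eye - z @ y)
        y = y @ t                              # y tends to sqrt(cov / trace)
        z = t @ z                              # z tends to its inverse
    return y * np.sqrt(trace)


def joint_pooling(features, num_iters=5):
    """Concatenate the first-order vector with the upper triangle of the
    square-root-normalised covariance (fusion by concatenation is assumed)."""
    first = global_average_pool(features)
    root = newton_schulz_sqrt(covariance_pool(features), num_iters)
    second = root[np.triu_indices(root.shape[0])]
    return np.concatenate([first, second])
```

For a feature map with C channels, `joint_pooling` returns a vector of length C + C(C+1)/2. Only matrix products are needed, which is why this kind of iteration suits GPU training better than an eigendecomposition; a handful of iterations gives a rough approximation, and more are needed when the covariance has very small eigenvalues relative to its trace.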

CLC Number:

TP391.4

Xiaoyong BIAN, Xiongjun FEI, Chunfang CHEN, Dongdong KAN, Sheng DING. Joint 1-2-order pooling network learning for remote sensing scene classification[J]. Journal of Computer Applications, 2022, 42(6): 1972-1978.

References

1	KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]// Proceedings of the 25th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2012: 1097-1105.
2	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2021-02-17]. DOI: 10.5244/c.28.6
3	LIN M, CHEN Q, YAN S C. Network in network[EB/OL]. (2015-04-10) [2021-02-17]. DOI: 10.1109/icicta.2014.118
4	MURRAY N, PERRONNIN F. Generalized max pooling[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 2473-2480. DOI: 10.1109/cvpr.2014.317
5	XIE G S, ZHANG X Y, SHU X B, et al. Task-driven feature pooling for image classification[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1179-1187. DOI: 10.1109/iccv.2015.140
6	WU M X, CHENG G, YAO X W, et al. Performance comparison of two pooling strategies for remote sensing image scene classification[C]// Proceedings of the 2019 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE, 2019: 3037-3040. DOI: 10.1109/igarss.2019.8899877
7	HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. DOI: 10.1109/tpami.2015.2389824
8	LIN T Y, RoyCHOWDHURY A, MAJI S. Bilinear CNN models for fine-grained visual recognition[C]// Proceedings of the 2015 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2015: 1449-1457. DOI: 10.1109/iccv.2015.170
9	LI P H, XIE J T, WANG Q L, et al. Is second-order information helpful for large-scale visual recognition?[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2089-2097. DOI: 10.1109/iccv.2017.228
10	LI P H, XIE J T, WANG Q L, et al. Towards faster training of global covariance pooling networks by iterative matrix square root normalization[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 947-955. DOI: 10.1109/cvpr.2018.00105
11	WANG Q L, GAO Z L, XIE J T, et al. Global gated mixture of second-order pooling for improving deep convolutional neural networks[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 1284-1293.
12	KIM J H, JUN J, ZHANG B T. Bilinear attention networks[C]// Proceedings of the 32nd International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates Inc., 2018: 1571-1581.
13	HE N J, FANG L Y, LI Y, et al. High-order self-attention network for remote sensing scene classification[C]// Proceedings of the 2019 IEEE International Geoscience and Remote Sensing Symposium. Piscataway: IEEE, 2019: 3013-3016. DOI: 10.1109/igarss.2019.8898320
14	XUE Y J, JU Z Y. Method for recognizing indoor scene classification based on fusion deep neural network with attention mechanism[J]. Journal of Chinese Computer Systems, 2021, 42(5): 1022-1028 (in Chinese). DOI: 10.3969/j.issn.1000-1220.2021.05.021
15	BIAN X Y, JIANG P L, ZHAO M, et al. Multi-branch neural network model based weakly supervised fine-grained image classification method[J]. Journal of Computer Applications, 2020, 40(5): 1295-1300 (in Chinese).
16	LIN T Y, MAJI S. Improved bilinear pooling with CNNs[C]// Proceedings of the 2017 British Machine Vision Conference. Durham: BMVA Press, 2017: No.117. DOI: 10.5244/c.31.117
17	ZHAO Z Y, ZHANG K R, HAO X J, et al. BiRA-Net: bilinear attention net for diabetic retinopathy grading[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 395-399. DOI: 10.1109/icip.2019.8803074
18	XIA G S, HU J W, HU F, et al. AID: a benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7): 3965-3981. DOI: 10.1109/tgrs.2017.2685945
19	CHENG G, HAN J W, LU X Q. Remote sensing image scene classification: benchmark and state of the art[J]. Proceedings of the IEEE, 2017, 105(10): 1865-1883. DOI: 10.1109/jproc.2017.2675998
20	KRIZHEVSKY A. Learning multiple layers of features from tiny images[R/OL]. (2009-04-08) [2021-02-17].
21	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. DOI: 10.1109/cvpr.2016.90
22	HE N J, FANG L Y, LI S T, et al. Skip-connected covariance network for remote sensing scene classification[J]. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(5): 1461-1474. DOI: 10.1109/tnnls.2019.2920374
23	ZAGORUYKO S, KOMODAKIS N. Wide residual networks[C]// Proceedings of the 2016 British Machine Vision Conference. Durham: BMVA Press, 2016: No.87. DOI: 10.5244/c.30.87
24	ZHONG X, GONG O B, HUANG W X, et al. Squeeze and excitation wide residual networks in image classification[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 395-399. DOI: 10.1109/icip.2019.8803000
25	LUAN S Z, CHEN C, ZHANG B C, et al. Gabor convolutional networks[J]. IEEE Transactions on Image Processing, 2018, 27(9): 4357-4366. DOI: 10.1109/tip.2018.2835143

Classification accuracy (%)
Method	AID, 20% training proportion	AID, 50% training proportion	NWPU-RESISC45, 10% training proportion	NWPU-RESISC45, 20% training proportion
GMP [4]*	90.46	93.98	87.04	91.39
TDP [5]*	90.87	94.23	87.56	91.45
ResNet-50 [21]*	92.89±0.31	95.49±0.08	89.03±0.28	91.96±0.08
VGG-16 [2]*	91.89±0.05	94.67±0.49	86.59±0.26	89.92±0.02
Bilinear-CNN [8]*	91.68±0.35	93.34±0.31	85.49±0.07	88.84±0.12
iSQRT-COV [10]*	93.28±0.10	95.23±0.38	90.19±0.20	92.83±0.07
GM-SOP [11]*	86.43±0.23	89.95±0.16	80.19±0.33	85.51±0.19
SCCov [22]	93.12±0.25	96.10±0.16	89.30±0.35	92.10±0.25
Ours (GAP+SOP)	93.80±0.05	96.32±0.19	90.77±0.05	93.38±0.16

Classification accuracy (%)
Method	CIFAR-10	CIFAR-100
ResNet-50 [21]	93.57	74.84
WRN-28-10 [23]	96.11	80.19
SE-WRN [24]	96.21	80.39
VGG-16 [2]	93.68	71.51
GCN [25]	96.12	79.87
Bilinear-CNN [8]*	92.58	73.29
iSQRT-COV [10]*	95.46	81.73
Ours (GAP+SOP)	96.51	83.30

Classification accuracy (%)
Method	AID, 20% training proportion	AID, 50% training proportion	NWPU-RESISC45, 10% training proportion	NWPU-RESISC45, 20% training proportion	CIFAR-10	CIFAR-100
Baseline	92.89	95.49	89.03	91.96	93.57	74.84
iSQRT-COV [10]*	93.28	95.23	90.19	92.83	95.46	81.73
Ours (GAP)	92.89	95.49	89.03	91.96	93.57	74.84
Ours (SOP)	93.63	96.02	90.35	93.18	96.06	82.27
Ours (GAP+SOP)	93.80	96.32	90.77	93.38	96.51	83.30
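
The percentage-point gains quoted in the abstract can be re-derived from the iSQRT-COV and GAP+SOP rows of the tables. The short Python check below copies the mean accuracies from those rows; the dictionary and function names are hypothetical.

```python
# Mean classification accuracies (%) copied from the table rows.
ISQRT_COV = {
    "AID (50%)": 95.23,
    "NWPU-RESISC45 (20%)": 92.83,
    "CIFAR-10": 95.46,
    "CIFAR-100": 81.73,
}
GAP_SOP = {
    "AID (50%)": 96.32,
    "NWPU-RESISC45 (20%)": 93.38,
    "CIFAR-10": 96.51,
    "CIFAR-100": 83.30,
}


def gains_in_points(ours, baseline):
    """Difference in percentage points, rounded to two decimals."""
    return {name: round(ours[name] - baseline[name], 2) for name in ours}


GAINS = gains_in_points(GAP_SOP, ISQRT_COV)
```

The resulting differences match the 1.09, 0.55, 1.05 and 1.57 percentage points reported in the abstract.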

Related Articles

[1]	Yumin HAN, Xiaoyan HAO. Material entity recognition based on subword embedding and relative attention [J]. Journal of Computer Applications, 2022, 42(6): 1862-1868.
[2]	Meng YU, Wentao HE, Xuchuan ZHOU, Mengtian CUI, Keqi WU, Wenjie ZHOU. Review of recommendation system [J]. Journal of Computer Applications, 2022, 42(6): 1898-1913.
[3]	Jia LI, Yuanlin ZHENG, Kaiyang LIAO, Haojie LOU, Shiyu LI, Zehao CHEN. No-reference image quality assessment algorithm based on saliency deep features [J]. Journal of Computer Applications, 2022, 42(6): 1957-1964.
[4]	Zhipei YANG, Sheng DING, Li ZHANG, Xinyu ZHANG. Anchor-free remote sensing image detection method for dense objects with rotation [J]. Journal of Computer Applications, 2022, 42(6): 1965-1971.
[5]	Yang ZHANG, Jiangbo HAO. Malicious code detection method based on attention mechanism and residual network [J]. Journal of Computer Applications, 2022, 42(6): 1708-1715.
[6]	Shan SU, Yang ZHANG, Dongwen ZHANG. Coupling related code smell detection method based on deep learning [J]. Journal of Computer Applications, 2022, 42(6): 1702-1707.
[7]	Jing JIANG, Yu CHEN, Jieping SUN, Shenggen JU. Integrating posterior probability calibration training into text classification algorithm [J]. Journal of Computer Applications, 2022, 42(6): 1789-1795.
[8]	Min WEN, Rongcun WANG, Shujuan JIANG. Source code vulnerability detection based on relational graph convolution network [J]. Journal of Computer Applications, 2022, 42(6): 1814-1821.
[9]	Zhen QU, Kunting LI, Zhixi FENG. Remote sensing image scene classification based on effective channel attention [J]. Journal of Computer Applications, 2022, 42(5): 1431-1439.
[10]	Yongru QIU, Guangle YAO, Jie FENG, Haoyu CUI. Single image de-raining algorithm based on semi-supervised learning [J]. Journal of Computer Applications, 2022, 42(5): 1577-1582.
[11]	Yongshuai LU, Yingjie TANG, Xinran MA. Low contrast filament sizing defect detection method of non-woven fabric based on deep feature fusion [J]. Journal of Computer Applications, 2022, 42(5): 1440-1446.
[12]	Xinlin XIE, Yi XIAO, Xinying XU. Lung nodule classification algorithm based on neural network architecture search [J]. Journal of Computer Applications, 2022, 42(5): 1424-1430.
[13]	Wei REN, Hexiang BAI. Multi-label image classification method based on global and local label relationship [J]. Journal of Computer Applications, 2022, 42(5): 1383-1390.
[14]	Yingjie WANG, Jiuqi ZHU, Zumin WANG, Fengbo BAI, Jian GONG. Review of applications of natural language processing in text sentiment analysis [J]. Journal of Computer Applications, 2022, 42(4): 1011-1020.
[15]	Jin ZHANG, Peiqi QU, Cheng SUN, Meng LUO. Safety helmet wearing detection algorithm based on improved YOLOv5 [J]. Journal of Computer Applications, 2022, 42(4): 1292-1300.