Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (3): 737-744. DOI: 10.11772/j.issn.1001-9081.2023040439
Special Issue: Artificial intelligence
Ning WU1,2, Yangyang LUO1, Huajie XU1,3
Received: 2023-04-18
Revised: 2023-06-26
Accepted: 2023-06-30
Online: 2023-12-04
Published: 2024-03-10
Contact: Huajie XU
About author: WU Ning, born in 1980 in Guigang, Guangxi, Ph. D., research fellow. His research interests include image processing, pattern recognition, and machine vision.
Ning WU, Yangyang LUO, Huajie XU. Semantic segmentation method for remote sensing images based on multi-scale feature fusion[J]. Journal of Computer Applications, 2024, 44(3): 737-744.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023040439
| Method category | Method | IoU/% (impervious surface) | IoU/% (building) | IoU/% (low vegetation) | IoU/% (tree) | IoU/% (car) | Parameters/MB | Computation/GFLOPs | mPA/% | mIoU/% |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN-based | PSPNet | 85.56 | 94.06 | 77.88 | 78.31 | 76.63 | 46.60 | 5.32 | 89.92 | 82.49 |
| CNN-based | FCN | 85.33 | 94.23 | 77.42 | 77.31 | 78.22 | 47.13 | 5.49 | 89.96 | 82.51 |
| CNN-based | DeepLabV3 | 85.56 | 94.04 | 77.79 | 78.35 | 78.46 | 65.74 | 6.36 | 90.69 | 82.84 |
| Transformer-based | SETR | 82.03 | 93.98 | 76.72 | 77.62 | 77.43 | 310.65 | 40.66 | 88.43 | 81.56 |
| Transformer-based | Segmenter | 83.19 | 93.92 | 77.80 | 78.76 | 78.92 | 102.39 | 13.42 | 90.10 | 82.52 |
| Transformer-based | SegFormer | 85.61 | 92.09 | 78.07 | 76.80 | 89.01 | 3.72 | 1.22 | 91.75 | 84.32 |
| | FuseSwin | 87.00 | 94.00 | 79.56 | 78.86 | 90.93 | 56.94 | 73.98 | 93.03 | 86.07 |
Tab. 1 Comparison results of different methods on Potsdam dataset
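The mIoU column in Tab. 1 looks like the unweighted mean of the five per-class IoU values. The page does not state the definition, so that is an assumption here. A minimal Python sketch to check it against the reported numbers (the helper name `mean_iou` is hypothetical):

```python
# Per-class IoU (%) copied from Tab. 1, in the order:
# impervious surface, building, low vegetation, tree, car.
PER_CLASS_IOU = {
    "PSPNet":    [85.56, 94.06, 77.88, 78.31, 76.63],
    "FCN":       [85.33, 94.23, 77.42, 77.31, 78.22],
    "DeepLabV3": [85.56, 94.04, 77.79, 78.35, 78.46],
    "SETR":      [82.03, 93.98, 76.72, 77.62, 77.43],
    "Segmenter": [83.19, 93.92, 77.80, 78.76, 78.92],
    "SegFormer": [85.61, 92.09, 78.07, 76.80, 89.01],
    "FuseSwin":  [87.00, 94.00, 79.56, 78.86, 90.93],
}

# mIoU (%) as reported in the last column of Tab. 1.
REPORTED_MIOU = {
    "PSPNet": 82.49, "FCN": 82.51, "DeepLabV3": 82.84, "SETR": 81.56,
    "Segmenter": 82.52, "SegFormer": 84.32, "FuseSwin": 86.07,
}


def mean_iou(per_class_iou):
    """Unweighted mean over classes (assumed definition of mIoU)."""
    return sum(per_class_iou) / len(per_class_iou)


for method, ious in PER_CLASS_IOU.items():
    computed = mean_iou(ious)
    gap = abs(computed - REPORTED_MIOU[method])
    print(f"{method}: computed {computed:.3f}, reported {REPORTED_MIOU[method]:.2f}, gap {gap:.3f}")
```

Under this assumption the recomputed means agree with the reported mIoU column to within 0.01 for all seven methods.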
| Method category | Method | PA/% (oyster raft) | PA/% (land) | IoU/% (oyster raft) | IoU/% (land) | Parameters/MB | Computation/GFLOPs | mPA/% | mIoU/% |
|---|---|---|---|---|---|---|---|---|---|
| CNN-based | FCN | 84.32 | 98.23 | 70.56 | 95.84 | 47.13 | 5.49 | 91.28 | 83.20 |
| CNN-based | PSPNet | 82.63 | 97.94 | 69.85 | 96.94 | 46.60 | 5.32 | 90.29 | 83.40 |
| CNN-based | DeepLabV3 | 84.45 | 94.68 | 82.12 | 92.13 | 65.74 | 6.36 | 89.57 | 87.13 |
| Transformer-based | SETR | 86.85 | 95.20 | 72.37 | 95.59 | 310.65 | 40.66 | 91.03 | 83.98 |
| Transformer-based | Segmenter | 90.64 | 97.31 | 81.56 | 93.20 | 102.39 | 13.42 | 93.98 | 87.38 |
| Transformer-based | SegFormer | 91.86 | 95.19 | 88.76 | 92.74 | 3.72 | 1.22 | 93.53 | 90.75 |
| | FuseSwin | 96.21 | 98.11 | 91.70 | 96.34 | 56.94 | 73.98 | 97.16 | 94.02 |
Tab. 2 Comparison results of different methods on oyster rafts dataset
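Tab. 2 reports per-class PA and IoU together with their means. The page does not give the formulas, so the sketch below assumes the common confusion-matrix definitions: per-class PA = TP / (TP + FN) and IoU = TP / (TP + FP + FN), with mPA and mIoU as unweighted class means. The function name and the 2-class confusion matrix are hypothetical and are not data from the paper.

```python
def per_class_metrics(confusion):
    """Return (PA, IoU) lists from a square confusion matrix.

    confusion[i][j] = number of pixels whose true class is i and
    whose predicted class is j (assumed layout).
    """
    n = len(confusion)
    pa, iou = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                         # true c, predicted as another class
        fp = sum(confusion[r][c] for r in range(n)) - tp    # another class, predicted as c
        pa.append(tp / (tp + fn) if tp + fn else 0.0)
        iou.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
    return pa, iou


# Hypothetical pixel counts for two classes (class 0 and class 1).
confusion = [
    [90, 10],
    [5, 895],
]

pa, iou = per_class_metrics(confusion)
m_pa = sum(pa) / len(pa)      # mean pixel accuracy over classes
m_iou = sum(iou) / len(iou)   # mean IoU over classes
print(f"PA per class:  {[round(100 * v, 2) for v in pa]}")
print(f"IoU per class: {[round(100 * v, 2) for v in iou]}")
print(f"mPA = {100 * m_pa:.2f}%, mIoU = {100 * m_iou:.2f}%")
```

The mPA and mIoU columns of Tab. 2 are consistent with this averaging step: for FuseSwin, (96.21 + 98.11) / 2 = 97.16 and (91.70 + 96.34) / 2 = 94.02.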
| Experiment No. | AEM | Multi-scale feature fusion | ASPP | mPA/% | mIoU/% |
|---|---|---|---|---|---|
| ① | × | √ | √ | 96.41 | 93.20 |
| ② | √ | × | √ | 89.60 | 81.11 |
| ③ | √ | √ | × | 96.80 | 93.78 |
| ④ | √ | × | × | 79.63 | 75.56 |
| ⑤ | √ | √ | √ | 97.16 | 94.02 |
Tab. 3 Results of ablation experiments
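One way to read Tab. 3 is as the mIoU lost when a module is switched off, relative to the full configuration ⑤. The short Python sketch below only does that subtraction on the reported values. The dictionary layout and the helper `miou_drop` are illustrative choices, not part of the paper.

```python
# (AEM, multi-scale feature fusion, ASPP) -> (mPA %, mIoU %), copied from Tab. 3.
ABLATION = {
    (False, True, True): (96.41, 93.20),    # experiment 1
    (True, False, True): (89.60, 81.11),    # experiment 2
    (True, True, False): (96.80, 93.78),    # experiment 3
    (True, False, False): (79.63, 75.56),   # experiment 4
    (True, True, True): (97.16, 94.02),     # experiment 5 (all modules on)
}
MODULES = ("AEM", "multi-scale feature fusion", "ASPP")
FULL = (True, True, True)


def miou_drop(config):
    """mIoU of the full model minus mIoU of `config`, in percentage points."""
    return ABLATION[FULL][1] - ABLATION[config][1]


for config in ABLATION:
    if config == FULL:
        continue
    removed = [name for name, enabled in zip(MODULES, config) if not enabled]
    print(f"without {' + '.join(removed)}: mIoU drop = {miou_drop(config):.2f} points")
```

On these numbers, removing multi-scale feature fusion costs the most mIoU (12.91 points), against 0.82 points for AEM and 0.24 points for ASPP.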
1 | KOTARIDIS I, LAZARIDOU M. Remote sensing image segmentation advances: a meta-analysis [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 173: 309-322. 10.1016/j.isprsjprs.2021.01.020 |
2 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. [2023-05-22]. |
3 | ALEISSAEE A A, KUMAR A, ANWER R M, et al. Transformers in remote sensing: a survey [EB/OL]. [2023-02-11]. 10.3390/rs15071860 |
4 | NASEER M, RANASINGHE K, KHAN S H, et al. Intriguing properties of vision Transformers [J]. Advances in Neural Information Processing Systems, 2021, 34: 23296-23308. |
5 | FU L Y, YIN M X, YANG F. Transformer based U-shaped medical image segmentation network: a survey [J]. Journal of Computer Applications, 2023, 43(5): 1584-1595. |
6 | WANG L, XUAN S B, QIN X Y, et al. Multi-object tracking method based on dual-decoder Transformer [J]. Journal of Computer Applications, 2023, 43(6): 1919-1929. |
7 | XU Z, ZHANG W, ZHANG T, et al. Efficient Transformer for remote sensing image segmentation [J]. Remote Sensing, 2021, 13(18): 3585. 10.3390/rs13183585 |
8 | YUAN X, SHI J, GU L. A review of deep learning methods for semantic segmentation of remote sensing imagery [J]. Expert Systems with Applications, 2021, 169: 114417. 10.1016/j.eswa.2020.114417 |
9 | ZHAO T, XU J, CHEN R, et al. Remote sensing image segmentation based on the fuzzy deep convolutional neural network [J]. International Journal of Remote Sensing, 2021, 42(16): 6264-6283. 10.1080/01431161.2021.1938738 |
10 | LIN T-Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. 10.1109/cvpr.2017.106 |
11 | KIRILLOV A, GIRSHICK R, HE K, et al. Panoptic feature pyramid networks [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 6392-6401. 10.1109/cvpr.2019.00656 |
12 | LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440. 10.1109/cvpr.2015.7298965 |
13 | RONNEBERGER O, FISCHER P, BROX T. U-net: convolutional networks for biomedical image segmentation [C]// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241. |
14 | ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6230-6239. 10.1109/cvpr.2017.660 |
15 | CHEN L-C, PAPANDREOU G, KOKKINOS I, et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs [EB/OL]. (2014-12-22) [2023-01-10]. |
16 | CHEN L-C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848. 10.1109/tpami.2017.2699184 |
17 | CHEN L-C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation [EB/OL]. [2023-01-10]. |
18 | CHEN L-C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation [C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 833-851. 10.1007/978-3-030-01234-2_49 |
19 | ZHENG S, LU J, ZHAO H, et al. Rethinking semantic segmentation from a sequence-to-sequence perspective with Transformers [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 6877-6886. 10.1109/cvpr46437.2021.00681 |
20 | STRUDEL R, GARCIA R, LAPTEV I, et al. Segmenter: Transformer for semantic segmentation [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 7242-7252. 10.1109/iccv48922.2021.00717 |
21 | XIE E, WANG W, YU Z, et al. SegFormer: simple and efficient design for semantic segmentation with Transformers [J]. Advances in Neural Information Processing Systems, 2021, 34: 12077-12090. |
22 | LIU Z, LIN Y T, CAO Y, et al. Swin Transformer: hierarchical vision Transformer using shifted windows [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 10012-10022. 10.1109/iccv48922.2021.00986 |
23 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. 10.1109/cvpr.2018.00745 |
24 | International Society for Photogrammetry and Remote Sensing. 2D semantic labeling contest - Potsdam [DB/OL]. [2023-06-21]. |