Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (3): 936-942.DOI: 10.11772/j.issn.1001-9081.2022020210
• Multimedia computing and computer simulation • Previous Articles
Xuedong HE1, Shibin XUAN1,2(), Kuan WANG1, Mengnan CHEN1
Received:
2022-02-24
Revised:
2022-05-25
Accepted:
2022-05-25
Online:
2022-08-16
Published:
2023-03-10
Contact:
Shibin XUAN
About author:
HE Xuedong, born in 1997, M. S. candidate. His research interests include semantic segmentation, computer vision.Supported by:
通讯作者:
宣士斌
作者简介:
何雪东(1997—),男,吉林松原人,硕士研究生,CCF会员,主要研究方向:语义分割、计算机视觉基金资助:
CLC Number:
Xuedong HE, Shibin XUAN, Kuan WANG, Mengnan CHEN. DeepLabV3+ image segmentation algorithm fusing cumulative distribution function and channel attention mechanism[J]. Journal of Computer Applications, 2023, 43(3): 936-942.
何雪东, 宣士斌, 王款, 陈梦楠. 融合累积分布函数和通道注意力机制的DeepLabV3+图像分割算法[J]. 《计算机应用》唯一官方网站, 2023, 43(3): 936-942.
Add to citation manager EndNote|Ris|BibTeX
URL: http://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022020210
软硬件配置 | 配置详情 |
---|---|
CPU | Intel Xeon Silver 4114 |
内存 | 256 GB |
显卡 | RTX 8000 |
操作系统 | Ubuntu 20.04 |
CUDA | Cuda 11.4 |
Python | Python 3.6 |
Pytorch | Pytorch 1.8.1 |
Tab. 1 Machine software and hardware configuration
软硬件配置 | 配置详情 |
---|---|
CPU | Intel Xeon Silver 4114 |
内存 | 256 GB |
显卡 | RTX 8000 |
操作系统 | Ubuntu 20.04 |
CUDA | Cuda 11.4 |
Python | Python 3.6 |
Pytorch | Pytorch 1.8.1 |
1/8 | 1/16 | ASPP空洞率 | mIoU/% | 浮点运算量/GFLOPs | 参数量/106 |
---|---|---|---|---|---|
— | — | (6,12,18) | 78.85 | 92.82 | 59.23 |
| — | 79.30 | 93.66 | 59.43 | |
— | | 79.02 | 108.44 | 60.94 | |
| | 79.56 | 121.71 | 62.18 | |
— | — | (4,8,12,16) | 79.14 | 98.03 | 64.02 |
| — | 80.09 | 98.87 | 64.22 | |
— | | 79.29 | 113.65 | 65.72 | |
| | 79.80 | 126.92 | 66.96 |
Tab. 2 Influence of CDCA module and ASPP atrous rate on network
1/8 | 1/16 | ASPP空洞率 | mIoU/% | 浮点运算量/GFLOPs | 参数量/106 |
---|---|---|---|---|---|
— | — | (6,12,18) | 78.85 | 92.82 | 59.23 |
| — | 79.30 | 93.66 | 59.43 | |
— | | 79.02 | 108.44 | 60.94 | |
| | 79.56 | 121.71 | 62.18 | |
— | — | (4,8,12,16) | 79.14 | 98.03 | 64.02 |
| — | 80.09 | 98.87 | 64.22 | |
— | | 79.29 | 113.65 | 65.72 | |
| | 79.80 | 126.92 | 66.96 |
模型 | mIoU/% | 浮点运算量/GFLOPs | 参数量/106 | 训练时间/h |
---|---|---|---|---|
DeepLabV2 | 76.35 | 75.40 | 61.41 | — |
DeepLabV3 | 77.21 | 71.16 | 58.04 | — |
DeepLabV3+ | 78.85 | 92.93 | 59.23 | 9.8 |
改进DeepLabV3+ | 79.97 | 99.53 | 64.65 | — |
模型1 | 79.30 | 93.66 | 59.43 | 10.0 |
模型2 | 80.09 | 98.87 | 64.22 | 11.2 |
Tab. 3 Comparison results of different models
模型 | mIoU/% | 浮点运算量/GFLOPs | 参数量/106 | 训练时间/h |
---|---|---|---|---|
DeepLabV2 | 76.35 | 75.40 | 61.41 | — |
DeepLabV3 | 77.21 | 71.16 | 58.04 | — |
DeepLabV3+ | 78.85 | 92.93 | 59.23 | 9.8 |
改进DeepLabV3+ | 79.97 | 99.53 | 64.65 | — |
模型1 | 79.30 | 93.66 | 59.43 | 10.0 |
模型2 | 80.09 | 98.87 | 64.22 | 11.2 |
模型 | mIoU |
---|---|
DeepLab V3+ | 79.09 |
模型1 | 79.68 |
模型2 | 80.11 |
Tab. 4 mIoU comparison on Cityscapes dataset
模型 | mIoU |
---|---|
DeepLab V3+ | 79.09 |
模型1 | 79.68 |
模型2 | 80.11 |
1 | YAN H T, ZHANG C, WU M. Lawin Transformer: improving semantic segmentation transformer with multi-scale representations via large window attention[EB/OL]. (2022-01-05) [2022-02-11].. 10.48550/arXiv.2201.01615 |
2 | 田萱,王亮,丁琪. 基于深度学习的图像语义分割方法综述[J]. 软件学报, 2019, 30(2):440-468. 10.13328/j.cnki.jos.005659 |
TIAN X, WANG L, DING Q. Review of image semantic segmentation based on deep learning[J]. Journal of Software, 2019, 30(2): 440-468. 10.13328/j.cnki.jos.005659 | |
3 | 王龙飞,严春满. 道路场景语义分割综述[J]. 激光与光电子学进展, 2021, 58(12): No.1200002. 10.3788/lop202158.1200002 |
WANG L F, YAN C M. Review on semantic segmentation of road scenes[J]. Laser and Optoelectronics Progress, 2021, 58(12): No.1200002. 10.3788/lop202158.1200002 | |
4 | PANELLA F, LIPANI A, BOEHM J. Semantic segmentation of cracks: data challenges and architecture[J]. Automation in Construction, 2022, 135: No.104110. 10.1016/j.autcon.2021.104110 |
5 | MINAEE S, BOYKOV Y, PORIKLI F, et al. Image segmentation using deep learning: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(7): 3523-3542. |
6 | LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440. 10.1109/cvpr.2015.7298965 |
7 | GUO M H, LIU Z N, MU T J, et al. Beyond self-attention: external attention using two linear layers for visual tasks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022(Early Access): 1-13. 10.1109/tpami.2022.3211006 |
8 | GUO M H, XU T X, LIU J J, et al. Attention mechanisms in computer vision: a survey[J]. Computational Visual Media, 2022, 8(3): 331-368. 10.1007/s41095-022-0271-y |
9 | FAN M Y, LAI S Q, HUANG J S, et al. Rethinking BiSeNet for real-time semantic segmentation[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 9711-9720. 10.1109/cvpr46437.2021.00959 |
10 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. 10.1109/cvpr.2018.00745 |
11 | CHEN L C, PAPANDREOU G, KOKKINOS I, et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs[EB/OL]. (2016-06-07) [2022-02-10].. 10.1109/tpami.2017.2699184 |
12 | CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848. 10.1109/tpami.2017.2699184 |
13 | CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. (2017-12-05) [2022-02-11].. 10.1007/978-3-030-01234-2_49 |
14 | CHEN L C, ZHU Y K, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 833-851. 10.1007/978-3-030-01234-2_49 |
15 | WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539. 10.1109/cvpr42600.2020.01155 |
16 | WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19. |
17 | 杨贞,彭小宝,朱强强,等. 基于Deeplab V3 plus的自适应注意力机制图像分割算法[J]. 计算机应用, 2022, 42(1):230-238. |
YANG Z, PENG X B, ZHU Q Q, et al. Image segmentation algorithm with adaptive attention mechanism based on Deeplab V3 Plus[J]. Journal of Computer Applications, 2022, 42(1): 230-238. | |
18 | YU F, KOLTUN V, FUNKHOUSER T. Dilated residual networks[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 636-644. 10.1109/cvpr.2017.75 |
19 | 张蕊,李锦涛. 基于深度学习的场景分割算法研究综述[J]. 计算机研究与发展, 2020, 57(4):859-875. 10.7544/issn1000-1239.2020.20190513 |
ZHANG R, LI J T. A survey on algorithm research of scene parsing based on deep learning[J]. Journal of Computer Research and Development, 2020, 57(4): 859-875. 10.7544/issn1000-1239.2020.20190513 | |
20 | HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. 10.1109/cvpr.2016.90 |
21 | HOWARD A G, ZHU M L, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. (2017-04-17) [2022-02-13].. 10.48550/arXiv.1704.04861 |
22 | SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. 10.1109/cvpr.2018.00474 |
23 | HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1314-1324. 10.1109/iccv.2019.00140 |
24 | CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1800-1807. 10.1109/cvpr.2017.195 |
25 | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10) [2021-12-20].. |
26 | 程晓悦,赵龙章,胡穹,等. 基于膨胀卷积平滑及轻型上采样的实时语义分割[J]. 激光与光电子学进展, 2020, 57(2): No.021017. 10.3788/lop57.021017 |
CHENG X Y, ZHAO L Z, HU Q, et al. Real-time semantic segmentation based on dilated convolution smoothing and lightweight up-sampling[J]. Laser and Optoelectronics Progress, 2020, 57(2): No.021017. 10.3788/lop57.021017 | |
27 | HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. 10.1109/tpami.2015.2389824 |
28 | 徐聪,王丽. 基于改进DeepLabv3+网络的图像语义分割方法[J]. 激光与光电子学进展, 2021, 58(16): No.1610008. 10.3788/lop202158.1610008 |
XU C, WANG L. Image semantic segmentation method based on improved DeepLabv3+ network[J]. Laser and Optoelectronics Progress, 2021, 58(16): No.1610008. 10.3788/lop202158.1610008 | |
29 | LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. 10.1109/cvpr.2017.106 |
30 | EVERINGHAM M, ESLAMI S M A, VAN GOOL L, et al. The PASCAL visual object classes challenge: a retrospective[J]. International Journal of Computer Vision, 2015, 111(1): 98-136. 10.1007/s11263-014-0733-5 |
31 | CORDTS M, OMRAN M, RAMOS S, et al. The Cityscapes dataset for semantic urban scene understanding[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 3213-3223. 10.1109/cvpr.2016.350 |
32 | OpenMMLab. MMSegmentation[CP/OL]. [2021-10-10].. |
33 | XIE E Z, WANG W H, YU Z D, et al. SegFormer: simple and efficient design for semantic segmentation with Transformers[EB/OL]. (2021-10-28) [2022-02-12].. |
[1] | Rongjun CHEN, Xuanhui YAN, Chaocheng YANG. Fusion imaging-based recurrent capsule classification network for time series [J]. Journal of Computer Applications, 2023, 43(3): 692-699. |
[2] | Jiangfeng ZHANG, Tao YAN, Bin CHEN, Yuhua QIAN, Yantao SONG. Multi-depth-of-field 3D shape reconstruction with global spatio-temporal feature coupling [J]. Journal of Computer Applications, 2023, 43(3): 894-902. |
[3] | Boyi FU, Yuncong PENG, Xin LAN, Xiaolin QIN. Survey of label noise learning algorithms based on deep learning [J]. Journal of Computer Applications, 2023, 43(3): 674-684. |
[4] | Li’an ZHU, Hong ZHANG. Nonhomogeneous image dehazing based on dual-branch conditional generative adversarial network [J]. Journal of Computer Applications, 2023, 43(2): 567-574. |
[5] | Qi WANG, Hang LEI, Xupeng WANG. Deep face verification under pose interference [J]. Journal of Computer Applications, 2023, 43(2): 595-600. |
[6] | Ping WANG, Nan CHEN, Lei LU. Fall detection algorithm based on scene prior and attention guidance [J]. Journal of Computer Applications, 2023, 43(2): 529-535. |
[7] | Youxin WANG, Bin CHEN. Print defect detection method based on deep comparison network [J]. Journal of Computer Applications, 2023, 43(1): 250-258. |
[8] | Zhijun SHEN, Lina MU, Jing GAO, Yuanhang SHI, Zhiqiang LIU. Review of fine-grained image categorization [J]. Journal of Computer Applications, 2023, 43(1): 51-60. |
[9] | Keyou GUO, Xue LI, Min YANG. Real‑time detection method of traffic information based on lightweight YOLOv4 [J]. Journal of Computer Applications, 2023, 43(1): 74-80. |
[10] | Yangping LIN, Jia LIU, Pei CHEN, Mingshu ZHANG, Xiaoyuan YANG. Semi-generative video steganography scheme based on deep convolutional generative adversarial net [J]. Journal of Computer Applications, 2023, 43(1): 169-175. |
[11] | Jun ZHANG, Pengli WU, Lukui SHI, Jin SHI, Bin PAN. Deep learning model for multi-station temperature prediction combined with MOD11A1 and surface meteorological station data [J]. Journal of Computer Applications, 2023, 43(1): 321-328. |
[12] | Jinghan YIN, Shaojun QU, Zekai YAO, Xuanye HU, Xiaoyu QIN, Pujing HUA. Traffic sign recognition model in haze weather based on YOLOv5 [J]. Journal of Computer Applications, 2022, 42(9): 2876-2884. |
[13] | Jiaxuan WEI, Shikang DU, Zhixuan YU, Ruisheng ZHANG. Review of white-box adversarial attack technologies in image classification [J]. Journal of Computer Applications, 2022, 42(9): 2732-2741. |
[14] | Jinghu LI, Qianguo XING, Xiangyang ZHENG, Lin LI, Lili WANG. Noctiluca scintillans red tide extraction method from UAV images based on deep learning [J]. Journal of Computer Applications, 2022, 42(9): 2969-2974. |
[15] | Yining WANG, Qingshan ZHAO, Pinle QIN, Yulan HU, Chunmei ZONG. Super-resolution reconstruction algorithm of medical image based on lightweight dense neural network [J]. Journal of Computer Applications, 2022, 42(8): 2586-2592. |
Viewed | ||||||
Full text |
|
|||||
Abstract |
|
|||||