Journal of Computer Applications, 2023, Vol. 43, Issue (3): 936-942. DOI: 10.11772/j.issn.1001-9081.2022020210
Special Issue: Multimedia computing and computer simulation
					
Xuedong HE1, Shibin XUAN1,2, Kuan WANG1, Mengnan CHEN1
Received: 2022-02-24
Revised: 2022-05-25
Accepted: 2022-05-25
Online: 2022-08-16
Published: 2023-03-10
Contact: Shibin XUAN
About author: HE Xuedong, born in 1997 in Songyuan, Jilin, M. S. candidate, CCF member. His research interests include semantic segmentation and computer vision.
Xuedong HE, Shibin XUAN, Kuan WANG, Mengnan CHEN. DeepLabV3+ image segmentation algorithm fusing cumulative distribution function and channel attention mechanism[J]. Journal of Computer Applications, 2023, 43(3): 936-942.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2022020210
| Item | Configuration |
|---|---|
| CPU | Intel Xeon Silver 4114 |
| Memory | 256 GB |
| GPU | RTX 8000 |
| Operating system | Ubuntu 20.04 |
| CUDA | CUDA 11.4 |
| Python | Python 3.6 |
| PyTorch | PyTorch 1.8.1 |
Tab. 1 Machine software and hardware configuration
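To compare a local setup against the configuration in Tab. 1, a small check like the following may help. It is only a sketch: the function name `environment_report` is hypothetical, and PyTorch is treated as optional so the script still runs where it is not installed.

```python
import platform


def environment_report():
    """Collect the software versions relevant to Tab. 1 for this interpreter."""
    report = {
        "python": platform.python_version(),
        "os": platform.platform(),
        "torch": None,  # filled in only if PyTorch can be imported
        "cuda": None,   # CUDA version PyTorch was built with, if any
        "gpu": None,    # name of the first visible GPU, if any
    }
    try:
        import torch  # optional dependency
    except ImportError:
        return report
    report["torch"] = torch.__version__
    report["cuda"] = torch.version.cuda  # None for CPU-only builds
    if torch.cuda.is_available():
        report["gpu"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Differences from Tab. 1 (for example a newer PyTorch or CUDA version) do not necessarily prevent reproduction, but they are worth recording alongside any results.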
| CDCA (1/8) | CDCA (1/16) | ASPP atrous rates | mIoU/% | Computation/GFLOPs | Parameters/10^6 |
|---|---|---|---|---|---|
| - | - | (6,12,18) | 78.85 | 92.82 | 59.23 |
| ✓ | - | (6,12,18) | 79.30 | 93.66 | 59.43 |
| - | ✓ | (6,12,18) | 79.02 | 108.44 | 60.94 |
| ✓ | ✓ | (6,12,18) | 79.56 | 121.71 | 62.18 |
| - | - | (4,8,12,16) | 79.14 | 98.03 | 64.02 |
| ✓ | - | (4,8,12,16) | 80.09 | 98.87 | 64.22 |
| - | ✓ | (4,8,12,16) | 79.29 | 113.65 | 65.72 |
| ✓ | ✓ | (4,8,12,16) | 79.80 | 126.92 | 66.96 |
Tab. 2 Influence of CDCA module and ASPP atrous rate on network
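The two atrous-rate settings compared in Tab. 2 differ in how much input each ASPP branch spans. As a rough illustration, the sketch below applies the standard formula for the effective size of a dilated kernel, k + (k - 1)(r - 1). It assumes 3×3 kernels in the atrous branches, which is the usual ASPP choice but is not stated on this page; the function names are hypothetical.

```python
def effective_kernel_size(kernel_size, rate):
    """Span, in input pixels, covered by one atrous (dilated) kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def aspp_spans(rates, kernel_size=3):
    """Effective kernel size of each ASPP branch (assumes 3x3 kernels)."""
    return [effective_kernel_size(kernel_size, r) for r in rates]


print(aspp_spans((6, 12, 18)))     # [13, 25, 37]
print(aspp_spans((4, 8, 12, 16)))  # [9, 17, 25, 33]
```

Under this assumption, the four-rate setting samples the range of scales more densely and starts from a smaller span, at the cost of one extra branch, which is consistent with the higher GFLOPs and parameter counts in the lower half of the table.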
| Model | mIoU/% | Computation/GFLOPs | Parameters/10^6 | Training time/h |
|---|---|---|---|---|
| DeepLabV2 | 76.35 | 75.40 | 61.41 | - |
| DeepLabV3 | 77.21 | 71.16 | 58.04 | - |
| DeepLabV3+ | 78.85 | 92.93 | 59.23 | 9.8 |
| Improved DeepLabV3+ | 79.97 | 99.53 | 64.65 | - |
| Model 1 | 79.30 | 93.66 | 59.43 | 10.0 |
| Model 2 | 80.09 | 98.87 | 64.22 | 11.2 |
Tab. 3 Comparison results of different models
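The accuracy/cost trade-off in Tab. 3 is easier to read as differences against the DeepLabV3+ baseline. The sketch below does that arithmetic using values copied from the table; the dictionary layout and the `compare` helper are illustrative only and not part of the original work.

```python
# (mIoU in %, GFLOPs, parameters in millions), copied from Tab. 3
RESULTS = {
    "DeepLabV3+": (78.85, 92.93, 59.23),
    "Model 1": (79.30, 93.66, 59.43),
    "Model 2": (80.09, 98.87, 64.22),
}


def compare(name, baseline="DeepLabV3+"):
    """Absolute mIoU gain and relative cost increase versus the baseline."""
    miou, flops, params = RESULTS[name]
    b_miou, b_flops, b_params = RESULTS[baseline]
    return {
        "miou_gain_points": round(miou - b_miou, 2),
        "flops_increase_pct": round(100 * (flops - b_flops) / b_flops, 2),
        "params_increase_pct": round(100 * (params - b_params) / b_params, 2),
    }


for model in ("Model 1", "Model 2"):
    print(model, compare(model))
```

On these figures, Model 1 gains 0.45 mIoU points and Model 2 gains 1.24 points over DeepLabV3+, with Model 2 paying a noticeably larger increase in computation and parameters.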
| Model | mIoU/% |
|---|---|
| DeepLabV3+ | 79.09 |
| Model 1 | 79.68 |
| Model 2 | 80.11 |
Tab. 4 mIoU comparison on Cityscapes dataset
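For readers unfamiliar with the metric reported in these tables, mean intersection over union (mIoU) averages, over classes, the ratio TP / (TP + FP + FN) computed from a pixel-level confusion matrix. The following is a minimal sketch of that standard definition; the evaluation code actually used by the authors is not shown on this page, and the function name is hypothetical.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other
        denom = tp + fp + fn
        if denom:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious)


# Toy two-class example: per-class IoU is 3/5 and 5/7.
print(mean_iou([[3, 1], [1, 5]]))
```

Multiplying the result by 100 gives the percentage form used in the tables.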