Real-time semantic segmentation method based on squeezing and refining network

doi:10.11772/j.issn.1001-9081.2021050812

Journal of Computer Applications, 2022, Vol. 42, Issue (7): 1993-2000. DOI: 10.11772/j.issn.1001-9081.2021050812

Special Issue: Artificial intelligence

• Artificial intelligence •

Juan WANG¹,²,³, Xuliang YUAN¹, Minghu WU¹,²,³, Liquan GUO¹, Zishan LIU¹

1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan Hubei 430068, China
2. Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System (Hubei University of Technology), Wuhan Hubei 430068, China
3. Postdoctoral Mobile Research Station of Hua’an Technology Company Limited, Wuhan Hubei 430068, China

Received: 2021-05-18 Revised: 2021-09-22 Accepted: 2021-09-24 Online: 2022-07-15 Published: 2022-07-10
Contact: Minghu WU
About the authors: WANG Juan, born in 1983 in Wuhan, Hubei, Ph. D., associate professor. Her research interests include artificial intelligence, computer vision, and deep learning.
YUAN Xuliang, born in 1992 in Hechi, Guangxi, M. S. candidate. His research interests include artificial intelligence, computer vision, and image segmentation.
GUO Liquan, born in 1997 in Huanggang, Hubei, M. S. candidate. His research interests include artificial intelligence and machine vision.
LIU Zishan, born in 1997 in Wuhan, Hubei, M. S. candidate. Her research interests include artificial intelligence and machine vision.
Supported by:
National Natural Science Foundation of China (62006073)

Abstract

To address the difficulty that current semantic segmentation algorithms have in balancing real-time inference against high-precision segmentation, a Squeezing and Refining Network (SRNet) was proposed to improve both inference speed and segmentation accuracy. Firstly, One-Dimensional (1D) dilated convolution and a bottleneck-like structure were introduced into the Squeezing and Refining (SR) unit, which greatly reduced the computational cost and the number of parameters of the model. Secondly, a multi-scale Spatial Attention (SA) fusion module was introduced to make efficient use of the spatial information in shallow-layer features. Finally, the encoder was formed by stacking SR units, and two SA units placed at the tail of the encoder formed the decoder. Experimental simulations show that SRNet achieves a Mean Intersection over Union (MIoU) of 68.3% on the Cityscapes dataset with only 30 MB of parameters and 8.8×10⁹ FLoating-point Operations Per Second (FLOPS). In addition, the model reaches a forward inference speed of 12.6 Frames Per Second (FPS) on a single NVIDIA Titan RTX card for an input size of 512×1024×3. The experimental results indicate that the designed lightweight model SRNet strikes a good balance between accurate segmentation and real-time inference, and is suitable for scenarios with limited computing power and power consumption.

Key words: semantic segmentation, lightweight network, real-time inference, Spatial Attention (SA) fusion module, one-dimensional dilated convolution
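
The parameter savings that the abstract attributes to 1D convolutions and a bottleneck-like structure can be illustrated with a rough weight count. The sketch below is not the SR unit itself, whose exact layout is not given on this page: the channel width of 64, the 3-tap kernels, the halving ratio, and all function names are assumptions chosen for illustration, and biases and normalization layers are ignored.

```python
def conv_weights(c_in, c_out, k_h, k_w):
    """Weight count of a dense 2D convolution (bias ignored)."""
    return c_in * c_out * k_h * k_w


def standard_block(c, k=3):
    """One k x k convolution that keeps the channel count at c."""
    return conv_weights(c, c, k, k)


def factorized_block(c, k=3):
    """A k x 1 convolution followed by a 1 x k convolution (two 1D kernels)."""
    return conv_weights(c, c, k, 1) + conv_weights(c, c, 1, k)


def bottleneck_factorized_block(c, k=3, squeeze=2):
    """Hypothetical bottleneck: 1x1 reduce, two 1D convolutions, 1x1 expand."""
    mid = c // squeeze
    return (
        conv_weights(c, mid, 1, 1)
        + conv_weights(mid, mid, k, 1)
        + conv_weights(mid, mid, 1, k)
        + conv_weights(mid, c, 1, 1)
    )


channels = 64  # assumed width, not taken from the paper
counts = {
    "standard 3x3": standard_block(channels),
    "factorized 3x1 + 1x3": factorized_block(channels),
    "bottleneck + factorized": bottleneck_factorized_block(channels),
}
for name, n in counts.items():
    print(f"{name}: {n} weights")
```

Dilation does not appear in the count because it spaces out the kernel taps without adding weights, which is why it can enlarge the receptive field at no parameter cost.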

CLC Number: TP391.4

Juan WANG, Xuliang YUAN, Minghu WU, Liquan GUO, Zishan LIU. Real-time semantic segmentation method based on squeezing and refining network[J]. Journal of Computer Applications, 2022, 42(7): 1993-2000.

References 27

1	DONG Y, PAN H W, CUI Q N, et al. Few-shot segmentation method for multi-modal magnetic resonance images of brain tumor[J]. Journal of Computer Applications, 2021, 41(4): 1049-1054. DOI: 10.11772/j.issn.1001-9081.2020081388
2	SHE Y L, ZHANG X L, CHENG R Q, et al. Semantic segmentation method based on edge attention model[J]. Journal of Computer Applications, 2021, 41(2): 343-349. DOI: 10.11772/j.issn.1001-9081.2020050725
3	GAO H J, ZENG X Y, PAN D Z, et al. Rectal tumor segmentation method based on improved U-Net model[J]. Journal of Computer Applications, 2020, 40(8): 2392-2397. DOI: 10.11772/j.issn.1001-9081.2020030318
4	BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495. DOI: 10.1109/tpami.2016.2644615
5	PASZKE A, CHAURASIA A, KIM S, et al. ENet: a deep neural network architecture for real-time semantic segmentation[EB/OL]. (2016-06-07)[2021-04-10].
6	RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[C]// Proceedings of the 2015 International Conference on Medical Image Computing and Computer-Assisted Intervention, LNCS 9351. Cham: Springer, 2015: 234-241.
7	TREML M, ARJONA-MEDINA J, UNTERTHINER T, et al. Speeding up semantic segmentation for autonomous driving[EB/OL]. [2021-04-17].
8	ROMERA E, ÁLVAREZ J M, BERGASA L M, et al. ERFNet: efficient residual factorized ConvNet for real-time semantic segmentation[J]. IEEE Transactions on Intelligent Transportation Systems, 2017, 19(1): 263-272. DOI: 10.1109/tits.2017.2750080
9	YU C Q, WANG J B, PENG C, et al. BiSeNet: bilateral segmentation network for real-time semantic segmentation[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11217. Cham: Springer, 2018: 334-349.
10	ZHAO H S, QI X J, SHEN X Y, et al. ICNet for real-time semantic segmentation on high-resolution images[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11207. Cham: Springer, 2018: 418-434.
11	WANG Y, ZHOU Q, LIU J, et al. LEDNet: a lightweight encoder-decoder network for real-time semantic segmentation[C]// Proceedings of the 2019 IEEE International Conference on Image Processing. Piscataway: IEEE, 2019: 1860-1864. DOI: 10.1109/icip.2019.8803154
12	GAMAL M, SIAM M, ABDEL-RAZEK M. ShuffleSeg: real-time semantic segmentation network[EB/OL]. (2018-03-15)[2021-04-11].
13	LI G, KIM J. DABNet: depth-wise asymmetric bottleneck for real-time semantic segmentation[C]// Proceedings of the 2019 British Machine Vision Conference. Durham: BMVA Press, 2019: No.259. DOI: 10.1109/access.2020.2971760
14	ZHAO H S, SHI J Q, QI X J, et al. Pyramid scene parsing network[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6230-6239. DOI: 10.1109/cvpr.2017.660
15	CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848.
16	LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 3431-3440. DOI: 10.1109/cvpr.2015.7298965
17	CHAURASIA A, CULURCIELLO E. LinkNet: exploiting encoder representations for efficient semantic segmentation[C]// Proceedings of the 2017 IEEE Visual Communications and Image Processing. Piscataway: IEEE, 2017: 1-4. DOI: 10.1109/vcip.2017.8305148
18	CHEN P R, HANG H M, CHAN S W, et al. DSNet: an efficient CNN for road scene segmentation[C]// Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Piscataway: IEEE, 2019: 424-432. DOI: 10.1109/apsipaasc47483.2019.9023104
19	PENG C, ZHANG X Y, YU G, et al. Large kernel matters: improve semantic segmentation by global convolutional network[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1743-1751. DOI: 10.1109/cvpr.2017.189
20	WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]// Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2018: 1451-1460. DOI: 10.1109/wacv.2018.00163
21	POUDEL R P K, LIWICKI S, CIPOLLA R. Fast-SCNN: fast semantic segmentation network[C]// Proceedings of the 2019 British Machine Vision Conference. Durham: BMVA Press, 2019: No.289.
22	ZHAO H Y, ZHANG Y, LIU S, et al. PSANet: point-wise spatial attention network for scene parsing[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11213. Cham: Springer, 2018: 270-286.
23	YU C Q, WANG J B, PENG C, et al. Learning a discriminative feature network for semantic segmentation[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 1857-1866. DOI: 10.1109/cvpr.2018.00199
24	HUANG Z L, WANG X G, HUANG L C, et al. CCNet: criss-cross attention for semantic segmentation[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 603-612. DOI: 10.1109/iccv.2019.00069
25	WANG X L, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803. DOI: 10.1109/cvpr.2018.00813
26	CHEN L C, YANG Y, WANG J, et al. Attention to scale: scale-aware semantic image segmentation[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 3640-3649. DOI: 10.1109/cvpr.2016.396
27	LIN G S, MILAN A, SHEN C H, et al. RefineNet: multi-path refinement networks for high-resolution semantic segmentation[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 5168-5177. DOI: 10.1109/cvpr.2017.549

Input size	320×480×3		640×960×3		512×1024×3
Algorithm	Inference time per frame/ms	Inference speed/FPS	Inference time per frame/ms	Inference speed/FPS	Inference time per frame/ms	Inference speed/FPS
ENet[5]	39	25.60	124	8.0	101	9.9
SegNet[4]	160	6.25	515	1.9	457	1.9
SQNet[7]	23	43.40	60	16.6	52	19.2
ERFNet[8]	37	27.00	76	13.1	67	14.9
Proposed algorithm	57	17.50	85	11.7	79	12.6
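
The two columns reported for each input size are related by a simple conversion. As a minimal sketch, assuming speed is 1000 divided by the per-frame time in milliseconds and that the page truncates (rather than rounds) to one decimal, the snippet below reproduces the proposed algorithm's row. The function names and the truncation rule are assumptions, not something the source states.

```python
import math


def fps_from_latency(latency_ms):
    """Frames per second implied by a per-frame latency in milliseconds."""
    return 1000.0 / latency_ms


def truncate(value, decimals=1):
    """Drop digits beyond `decimals` without rounding up."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


# Per-frame inference times of the proposed algorithm, keyed by input size.
proposed_latency_ms = {
    (320, 480, 3): 57,
    (640, 960, 3): 85,
    (512, 1024, 3): 79,
}

proposed_fps = {
    size: truncate(fps_from_latency(ms)) for size, ms in proposed_latency_ms.items()
}
for size, fps in proposed_fps.items():
    print(size, fps)
```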

Algorithm	FLOPS	Parameters/MB	MIoU/%
ENet[5]	8.1×10⁹	12	58.3
SegNet[4]	6.4×10¹¹	870	57.0
SQNet[7]	6.1×10¹⁰	76	59.8
ERFNet[8]	5.4×10¹⁰	67	68.0
Proposed algorithm	8.8×10⁹	30	68.3

Algorithm	Per-class prediction accuracy/%																			MIoU/%
Algorithm	Road	Sidewalk	Building	Wall	Fence	Pole	Traffic light	Traffic sign	Vegetation	Terrain	Sky	Person	Rider	Car	Truck	Bus	Train	Motorcycle	Bicycle	MIoU
ENet[5]	96.3	74.2	75.0	32.2	33.2	43.4	34.1	44.0	88.6	61.4	90.6	65.5	38.4	90.6	36.9	50.5	48.1	38.8	55.4	58.3
SegNet[4]	96.4	73.2	84.0	28.4	29.0	35.7	39.8	45.1	87.0	63.8	91.8	62.8	42.8	89.3	38.1	43.1	44.1	35.8	51.9	56.1
SQNet[7]	96.9	75.4	87.9	31.6	35.7	50.9	52.0	61.7	90.9	65.8	93.0	73.8	42.6	91.5	18.8	41.2	33.3	34.0	59.9	59.8
ERFNet[8]	97.2	80.0	89.5	41.6	45.3	56.4	60.5	64.6	91.4	68.7	94.2	65.5	38.4	90.6	36.9	50.5	48.1	38.8	55.4	68.0
Proposed algorithm	94.0	80.8	88.7	60.8	60.7	67.0	58.2	66.5	91.8	67.3	93.0	44.0	51.3	82.2	55.3	61.5	52.0	51.7	70.8	68.3
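
MIoU is the unweighted mean of the per-class scores. As a small check, assuming the per-class columns are IoU percentages (the table header calls them prediction accuracy), averaging the 19 values in the proposed algorithm's row gives the MIoU reported in its last column. The variable and function names are illustrative.

```python
# Per-class scores of the proposed algorithm, copied from the table row above
# (19 classes, road through bicycle).
proposed_scores = [
    94.0, 80.8, 88.7, 60.8, 60.7, 67.0, 58.2, 66.5, 91.8, 67.3,
    93.0, 44.0, 51.3, 82.2, 55.3, 61.5, 52.0, 51.7, 70.8,
]


def mean_score(scores):
    """Unweighted mean over classes, as used for MIoU."""
    return sum(scores) / len(scores)


miou = mean_score(proposed_scores)
print(f"{len(proposed_scores)} classes, mean = {miou:.1f}")
```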

Module	Speed/FPS	MIoU/%
Without SA module	12.9	67.4
Without dilation rate	13.3	67.1
Full model	12.6	68.3
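
The ablation row without dilation loses accuracy, which is consistent with dilation enlarging the receptive field. A minimal sketch of that effect follows, assuming stride-1 layers with 3-tap 1D kernels. The dilation schedule `[1, 2, 4, 8]` is hypothetical and is not the schedule used in SRNet, which this page does not give.

```python
def effective_kernel(kernel_size, dilation):
    """Span covered by one dilated 1D kernel along its axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(dilations, kernel_size=3):
    """Receptive field along one axis of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


plain = receptive_field([1, 1, 1, 1])    # four layers, no dilation
dilated = receptive_field([1, 2, 4, 8])  # same depth, assumed dilation rates
print(f"no dilation: {plain}, with dilation: {dilated}")
```

Both stacks hold the same number of weights, so the wider context comes without extra parameters.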
