Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (7): 1993-2000. DOI: 10.11772/j.issn.1001-9081.2021050812

• Artificial intelligence •

Real-time semantic segmentation method based on squeezing and refining network

Juan WANG1,2,3, Xuliang YUAN1, Minghu WU1,2,3, Liquan GUO1, Zishan LIU1

  1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan Hubei 430068, China
  2. Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System (Hubei University of Technology), Wuhan Hubei 430068, China
  3. Postdoctoral Mobile Research Station of Hua’an Technology Company Limited, Wuhan Hubei 430068, China
  • Received:2021-05-18 Revised:2021-09-22 Accepted:2021-09-24 Online:2022-07-15 Published:2022-07-10
  • Contact: Minghu WU
  • About the authors: WANG Juan, born in 1983, Ph.D., associate professor. Her research interests include artificial intelligence, computer vision, and deep learning.
    YUAN Xuliang, born in 1992, M.S. candidate. His research interests include artificial intelligence, computer vision, and image segmentation.
    GUO Liquan, born in 1997, M.S. candidate. His research interests include artificial intelligence and machine vision.
    LIU Zishan, born in 1997, M.S. candidate. Her research interests include artificial intelligence and machine vision.
  • Supported by:
    National Natural Science Foundation of China (62006073)

基于压缩提炼网络的实时语义分割方法 (Real-time semantic segmentation method based on squeezing and refining network)

王娟 (Juan WANG)1,2,3, 袁旭亮 (Xuliang YUAN)1, 武明虎 (Minghu WU)1,2,3, 郭力权 (Liquan GUO)1, 刘子杉 (Zishan LIU)1

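  1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068
  2. Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System (Hubei University of Technology), Wuhan 430068
  3. Postdoctoral Workstation, Wuhan Hua’an Technology Co., Ltd., Wuhan 430068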
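  • Corresponding author: 武明虎 (Minghu WU)
  • About the authors: 王娟 (WANG Juan, born 1983), female, from Wuhan, Hubei; associate professor, Ph.D. Main research interests: artificial intelligence, computer vision, and deep learning.
    袁旭亮 (YUAN Xuliang, born 1992), male, from Hechi, Guangxi; M.S. candidate. Main research interests: artificial intelligence, machine vision, and image segmentation.
    郭力权 (GUO Liquan, born 1997), male, from Huanggang, Hubei; M.S. candidate. Main research interests: artificial intelligence and machine vision.
    刘子杉 (LIU Zishan, born 1997), female, from Wuhan, Hubei; M.S. candidate. Main research interests: artificial intelligence and machine vision.
  • Funding:
    National Natural Science Foundation of China (62006073)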

Abstract:

To address the difficulty that current semantic segmentation algorithms have in balancing real-time inference against high-accuracy segmentation, a Squeezing and Refining Network (SRNet) was proposed to improve both inference speed and segmentation accuracy. Firstly, One-Dimensional (1D) dilated convolution and a bottleneck-like structural unit were introduced into the Squeezing and Refining (SR) unit, which greatly reduced the model's computational cost and number of parameters. Secondly, a multi-scale Spatial Attention (SA) hybrid module was introduced to make efficient use of the spatial information in shallow-layer features. Finally, the encoder was formed by stacking SR units, and two SA units were placed at the tail of the encoder to form the decoder. Experiments show that SRNet obtains 68.3% Mean Intersection over Union (MIoU) on the Cityscapes dataset with only 30 MB of parameters and 8.8×10^9 floating-point operations (FLOPs). In addition, the model reaches a forward inference speed of 12.6 Frames Per Second (FPS) with an input size of 512×1024×3 on a single NVIDIA Titan RTX card. The experimental results indicate that the lightweight SRNet strikes a good balance between accurate segmentation and real-time inference, and is suitable for scenarios with limited computing power and power consumption.
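The parameter savings claimed for the SR unit can be illustrated with simple arithmetic. The sketch below is not the authors' SR unit: the layer layout (1×1 squeeze, 3×1 and 1×3 one-dimensional convolutions, 1×1 expand), the reduction ratio of 4, and all function names are assumptions chosen only to show why factorized 1D kernels inside a bottleneck need far fewer weights than a full 3×3 convolution, and how dilation enlarges the span of a kernel without adding weights.

```python
def conv_params(c_in, c_out, kh, kw):
    """Weights in a kh x kw convolution (bias terms ignored)."""
    return c_in * c_out * kh * kw


def standard_block(channels):
    """Baseline: one full-width 3x3 convolution."""
    return conv_params(channels, channels, 3, 3)


def factorized_bottleneck_block(channels, reduction=4):
    """Hypothetical layout: 1x1 squeeze -> 3x1 -> 1x3 -> 1x1 expand."""
    mid = channels // reduction
    return (
        conv_params(channels, mid, 1, 1)    # squeeze channels
        + conv_params(mid, mid, 3, 1)       # 1D convolution along height
        + conv_params(mid, mid, 1, 3)       # 1D convolution along width
        + conv_params(mid, channels, 1, 1)  # expand channels back
    )


def effective_kernel_extent(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


if __name__ == "__main__":
    full = standard_block(128)
    slim = factorized_bottleneck_block(128)
    print(full, slim, round(full / slim, 2))
    print(effective_kernel_extent(3, 4))
```

With 128 channels, the assumed bottleneck uses 14 336 weights against 147 456 for the plain 3×3 layer, and a 3-tap kernel with dilation 4 spans 9 positions. The actual figures for SRNet depend on its real layer configuration, which the abstract does not specify.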

Key words: semantic segmentation, lightweight network, real-time inference, Spatial Attention (SA) hybrid module, one-dimensional dilated convolution
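For readers unfamiliar with the MIoU metric reported in the abstract, the following is a minimal sketch of how it is commonly computed from a confusion matrix. The function names and the toy labels are illustrative assumptions, and skipping classes that never occur is one common convention; the official Cityscapes evaluation accumulates counts over the whole dataset and may differ in detail.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy flattened label maps with three classes (made-up values).
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    print(mean_iou(confusion_matrix(y_true, y_pred, 3)))
```

In the toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5.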

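摘要 (Abstract):

To address the difficulty that current semantic segmentation algorithms have in balancing real-time inference against high-accuracy segmentation, a Squeezing and Refining Network (SRNet) is proposed to improve both the real-time performance of inference and the accuracy of segmentation. First, one-dimensional (1D) dilated convolution and a bottleneck-like structural unit are introduced into the Squeezing and Refining (SR) unit, greatly reducing the model's computation and parameter count. Second, a multi-scale Spatial Attention (SA) hybrid module is introduced to exploit the spatial information of shallow-layer features efficiently. Finally, the encoder is built by stacking SR units, and two SA units are placed at the tail of the encoder to form the decoder. Experiments show that SRNet achieves 68.3% Mean Intersection over Union (MIoU) on the Cityscapes dataset with only 30 MB of parameters and 8.8×10^9 floating-point operations (FLOPs). In addition, the proposed model reaches a forward inference speed of 12.6 Frames Per Second (FPS) on a single NVIDIA Titan RTX card (input size 512×1024×3). The results indicate that the lightweight SRNet strikes a good balance between accurate segmentation and real-time inference, making it suitable for settings with limited computing power and power budgets.

关键词 (Key words): semantic segmentation, lightweight network, real-time inference, Spatial Attention (SA) hybrid module, one-dimensional dilated convolution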

CLC Number: