Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (5): 1326-1331. DOI: 10.11772/j.issn.1001-9081.2020081181

Topic: Artificial Intelligence


Light-weight road image semantic segmentation algorithm based on deep learning

HU Die, FENG Ziliang

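  1. College of Computer Science, Sichuan University, Chengdu 610065
  • Received: 2020-08-06 Revised: 2020-10-26 Publication date: 2021-05-10 Release date: 2020-12-09
  • Corresponding author: FENG Ziliang
  • About the authors: HU Die (born 1997), female, from Bijie, Guizhou, master's student; main research interests: image processing and computer vision. FENG Ziliang (born 1970), male, from Nanchong, Sichuan, research fellow, Ph.D.; main research interests: air traffic control application systems and image processing.
  • Supported by:
    Joint Fund of the National Natural Science Foundation of China and the Civil Aviation Administration of China (U1833115).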

Light-weight road image semantic segmentation algorithm based on deep learning

HU Die, FENG Ziliang   

  1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2020-08-06 Revised: 2020-10-26 Online: 2021-05-10 Published: 2020-12-09
  • Supported by:
    This work is partially supported by the Joint Funds of National Natural Science Foundation of China and Civil Aviation Administration of China (U1833115).

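Abstract: Deep-learning models for road image semantic segmentation have very large parameter counts and high computational complexity, which makes them unsuitable for deployment on mobile devices for real-time segmentation. To address this problem, MUNet, a lightweight symmetric U-shaped encoder-decoder image semantic segmentation network built with depthwise separable convolution, was proposed. First, a U-shaped encoder-decoder network was designed; second, sparse short connections were designed between the convolution blocks; finally, an attention mechanism and the Group Normalization (GN) method were introduced, so that segmentation accuracy improved while the number of model parameters and the amount of computation were reduced. On the CamVid road image dataset, after 1 000 rounds of training, the segmentation results of MUNet reached a Mean Intersection over Union (MIoU) of 61.92% with test images cropped to 720×720. Experimental results show that, compared with common image semantic segmentation networks such as Pyramid Scene Parsing Network (PSPNet), RefineNet, Global Convolutional Network (GCN) and DeepLabv3+, MUNet has fewer parameters and less computation while achieving better segmentation performance.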
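To make the parameter savings from depthwise separable convolution concrete, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution. The layer sizes (64 input channels, 128 output channels, 3×3 kernel) and the function names are illustrative assumptions and are not taken from MUNet.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen for illustration only (not from the paper).
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(sep / std, 4))  # 73728 8768 0.1189
```

For these assumed sizes the separable form needs roughly 12% of the weights, since the ratio is 1/c_out + 1/k².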

Key words: deep learning, road image semantic segmentation, depthwise separable convolution, lightweight neural network, attention mechanism

Abstract: Deep-learning models for road image semantic segmentation have large parameter counts and high computational complexity, which makes them unsuitable for deployment on mobile terminals for real-time segmentation. To solve this problem, MUNet, a lightweight symmetric U-shaped encoder-decoder image semantic segmentation network built with depthwise separable convolution, was proposed. First, a U-shaped encoder-decoder network was designed; then, sparse short connections were added between the convolution blocks; finally, an attention mechanism and the Group Normalization (GN) method were introduced to reduce the number of model parameters and the amount of computation while improving segmentation accuracy. On the CamVid road image dataset, after 1 000 rounds of training, MUNet achieved a Mean Intersection over Union (MIoU) of 61.92% with test images cropped to 720×720. Experimental results show that, compared with common image semantic segmentation networks such as Pyramid Scene Parsing Network (PSPNet), RefineNet, Global Convolutional Network (GCN) and DeepLabv3+, MUNet has fewer parameters and less computation while achieving better segmentation performance.
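The MIoU metric reported in the abstract can be sketched as follows. This is a minimal illustration over flattened label lists, assuming classes that appear in neither the prediction nor the ground truth are skipped; the function name and the toy labels are hypothetical, and the paper's exact evaluation protocol (for example, how ignored labels are handled on CamVid) is not stated here.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union over flattened integer label sequences."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1  # pixel counts toward intersection and union of class p
            union[p] += 1
        else:
            union[p] += 1  # false positive for class p
            union[t] += 1  # false negative for class t
    # Assumption: classes absent from both pred and target are excluded.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels for illustration only (not CamVid data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5.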

Key words: deep learning, road image semantic segmentation, depthwise separable convolution, lightweight neural network, attention mechanism

CLC number: