Journal of Computer Applications, 2022, Vol. 42, Issue (9): 2659-2666. DOI: 10.11772/j.issn.1001-9081.2021071327

• Artificial Intelligence •

Real-time segmentation algorithm based on attention mechanism and effective factorized convolution

WEN Kai (文凯)1,2, TANG Weiwei (唐伟伟)1,2, XIONG Junchen (熊俊臣)1,2

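  1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
    2. Research Center of New Communication Technology Application, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • Received: 2021-07-23 Revised: 2021-10-14 Accepted: 2021-10-18 Online: 2021-10-29 Published: 2022-09-10
  • Corresponding author: TANG Weiwei (唐伟伟)
  • About the authors: WEN Kai (文凯, born 1972), male, native of Chongqing, senior engineer, Ph.D. His main research interests are big data, computer vision, and mobile communication;
    XIONG Junchen (熊俊臣, born 1997), male, native of Dazhou, Sichuan, M.S. candidate. His main research interests are image semantic segmentation and image processing.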

Real-time segmentation algorithm based on attention mechanism and effective factorized convolution

Kai WEN1,2, Weiwei TANG1,2, Junchen XIONG1,2

  1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    2. Research Center of New Communication Technology Application, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2021-07-23 Revised: 2021-10-14 Accepted: 2021-10-18 Online: 2021-10-29 Published: 2022-09-10
  • Contact: Weiwei TANG
  • About the authors: WEN Kai, born in 1972, Ph.D., senior engineer. His research interests include big data, computer vision, and mobile communication.
    XIONG Junchen, born in 1997, M.S. candidate. His research interests include semantic segmentation of images and image processing.

Abstract:

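To address the problem that current real-time semantic segmentation algorithms have high computational cost and a large memory footprint and therefore cannot meet the needs of real-world scenes, a new shallow, lightweight real-time semantic segmentation algorithm is proposed: AEFNet, a real-time segmentation algorithm based on attention mechanism and effective factorized convolution. First, the one-dimensional non-bottleneck structure (Non-bottleneck-1D) is used to build a lightweight factorized convolution module that extracts rich contextual information while reducing computation; at the same time, it strengthens the algorithm's learning ability in a simple way and helps extract detail information. Then, pooling operations are combined with the Attention Refinement Module (ARM) to build a global context attention module that captures global information and refines each stage of the algorithm, thereby improving the segmentation results. The algorithm is validated on the public datasets Cityscapes and CamVid, achieving a precision of 74.0% and an inference speed of 118.9 Frames Per Second (FPS) on the Cityscapes test set. Compared with the Depth-wise Asymmetric Bottleneck Network (DABNet), the proposed algorithm improves precision by about 4 percentage points and inference speed by 14.7 FPS; compared with the recent efficient Enhanced Asymmetric Convolution Network (EACNet), its precision is 0.2 percentage points lower, but its inference speed is 6.9 FPS higher. Experimental results show that the proposed algorithm identifies scene information fairly accurately and meets real-time requirements.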

Keywords: factorized convolution, attention mechanism, spatial detail information, contextual information, lightweight algorithm
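
The computational saving attributed to factorized convolution can be illustrated with a simple parameter count. The sketch below is an illustration under stated assumptions, not the AEFNet module itself: it assumes a Non-bottleneck-1D-style design in which each 3×3 convolution is replaced by a 3×1 convolution followed by a 1×3 convolution, with equal input and output channel counts. The function names and the channel width of 64 are hypothetical.

```python
def conv_params(c_in, c_out, k_h, k_w, bias=True):
    """Number of learnable parameters in one 2D convolution layer."""
    weights = c_in * c_out * k_h * k_w
    return weights + (c_out if bias else 0)


def full_conv_params(channels, k=3):
    """Parameters of a standard k x k convolution (channels -> channels)."""
    return conv_params(channels, channels, k, k)


def factorized_conv_params(channels, k=3):
    """Parameters of a k x 1 convolution followed by a 1 x k convolution."""
    vertical = conv_params(channels, channels, k, 1)
    horizontal = conv_params(channels, channels, 1, k)
    return vertical + horizontal


if __name__ == "__main__":
    channels = 64  # hypothetical channel width, chosen only for illustration
    full = full_conv_params(channels)
    fact = factorized_conv_params(channels)
    print(f"3x3: {full} params, 3x1 + 1x3: {fact} params, ratio: {fact / full:.3f}")
```

With 64 channels this gives 36,928 parameters for the 3×3 layer against 24,704 for the factorized pair, roughly a one-third reduction. The actual savings in AEFNet depend on its layer configuration, which the abstract does not specify.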

Abstract:

Current real-time semantic segmentation algorithms have high computational cost and a large memory footprint, and therefore cannot meet the application requirements of real-world scenes. To solve these problems, a new shallow, lightweight real-time semantic segmentation algorithm was proposed: AEFNet (real-time segmentation algorithm based on Attention mechanism and Effective Factorized convolution). Firstly, the one-dimensional non-bottleneck structure (Non-bottleneck-1D) was adopted to construct a lightweight factorized convolution module that extracts rich contextual information and reduces the amount of computation. At the same time, the learning ability of the algorithm was enhanced in a simple way and the extraction of detail information was facilitated. Then, the pooling operation and the Attention Refinement Module (ARM) were combined to construct a global context attention module that captures global information and refines each stage of the algorithm to optimize the segmentation effect. The algorithm was verified on the public datasets Cityscapes and CamVid, and a precision of 74.0% and an inference speed of 118.9 Frames Per Second (FPS) were obtained on the Cityscapes test set. Compared with the Depth-wise Asymmetric Bottleneck Network (DABNet), the proposed algorithm increased precision by about 4 percentage points and inference speed by 14.7 FPS. Compared with the recent efficient Enhanced Asymmetric Convolution Network (EACNet), the proposed algorithm had 0.2 percentage points lower precision but 6.9 FPS higher inference speed. Experimental results show that the proposed algorithm can identify scene information fairly accurately and can meet real-time requirements.

Key words: factorized convolution, attention mechanism, spatial detailed information, contextual information, lightweight algorithm
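
As a rough illustration of the channel attention idea behind an ARM-style block, the sketch below pools each channel to a single value, passes the pooled vector through a 1×1 convolution (which on a pooled vector reduces to a matrix product), applies a sigmoid, and rescales each channel by the resulting gate. This is a simplification under stated assumptions, not the paper's global context attention module: batch normalization is omitted, and the names `global_average_pool` and `channel_attention`, the nested-list feature layout, and the identity weights in the usage example are all hypothetical.

```python
import math


def global_average_pool(features):
    """Average each channel of a C x H x W nested list down to one value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


def channel_attention(features, weights, biases):
    """Rescale each channel by a sigmoid gate computed from pooled statistics.

    weights is a C x C matrix standing in for a 1x1 convolution applied to
    the pooled vector; biases has length C. Both are hypothetical inputs.
    """
    pooled = global_average_pool(features)
    gates = []
    for weight_row, bias in zip(weights, biases):
        z = sum(w * p for w, p in zip(weight_row, pooled)) + bias
        gates.append(1.0 / (1.0 + math.exp(-z)))
    refined = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]
    return gates, refined


if __name__ == "__main__":
    # Two 2x2 channels; identity weights keep the example easy to follow.
    features = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 0.0]]]
    weights = [[1.0, 0.0], [0.0, 1.0]]
    biases = [0.0, 0.0]
    gates, refined = channel_attention(features, weights, biases)
    print(gates)
    print(refined)
```

In this toy input the first channel has a mean of 4.0 and receives a gate close to 1, while the all-zero channel receives a gate of exactly 0.5. In a trained network the weights would be learned rather than fixed.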

CLC number: