Journal of Computer Applications, 2022, Vol. 42, Issue (1): 230-238. DOI: 10.11772/j.issn.1001-9081.2021010137

• Multimedia computing and computer simulation •

Image segmentation algorithm with adaptive attention mechanism based on Deeplab V3 Plus

Zhen YANG, Xiaobao PENG, Qiangqiang ZHU, Zhijian YIN

  1. College of Communication and Electronics, Jiangxi Science and Technology Normal University, Nanchang, Jiangxi 330013, China
  • Received: 2021-01-25 Revised: 2021-04-22 Accepted: 2021-05-10 Online: 2021-06-04 Published: 2022-01-10
  • Contact: Xiaobao PENG
  • About author: YANG Zhen, born in 1985, Ph.D., lecturer. His research interests include object detection and image segmentation.
    PENG Xiaobao, born in 1995, M.S. candidate. His research interests include image segmentation.
    ZHU Qiangqiang, born in 1995, M.S. candidate. His research interests include object detection and recognition.
    YIN Zhijian, born in 1968, M.S., professor. His research interests include object recognition and tracking.
  • Supported by:
    National Natural Science Foundation of China (61866016); Surface Program of Jiangxi Natural Science Foundation (20202BABL202014); Outstanding Youth Project of Jiangxi Science and Technology Normal University (2018QNBJRC002)

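Adaptive attention mechanism image segmentation algorithm based on Deeplab V3 Plus

杨贞 (Zhen YANG), 彭小宝 (Xiaobao PENG), 朱强强 (Qiangqiang ZHU), 殷志坚 (Zhijian YIN)

  1. College of Communication and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330013
  • Corresponding author: 彭小宝 (Xiaobao PENG)
  • About the authors: YANG Zhen (born 1985), male, from Heze, Shandong; lecturer, Ph.D., CCF member. Main research interests: object detection, image segmentation.
    PENG Xiaobao (born 1995), male, from Xinyu, Jiangxi; M.S. candidate. Main research interest: image segmentation.
    ZHU Qiangqiang (born 1995), male, from Fuyang, Anhui; M.S. candidate. Main research interests: object detection and recognition.
    YIN Zhijian (born 1968), male, from Nanchang, Jiangxi; professor, M.S. Main research interests: object recognition and tracking.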

Abstract:

To address the premature loss of image detail and small-target information in the subsampling operations of Deeplab V3 Plus, an image semantic segmentation algorithm with an adaptive attention mechanism, built on the Deeplab V3 Plus network architecture, was proposed. Firstly, attention mechanism modules were embedded in the input layer, middle layer and output layer of the Deeplab V3 Plus backbone network, and each module was multiplied by a weight value that constrains its contribution. Secondly, the Deeplab V3 Plus embedded with the attention modules was trained on the PASCAL VOC2012 public segmentation dataset, and the weight values of the attention mechanism modules were set manually as empirical values. Then, various ways of fusing the attention mechanism modules in the input layer, the middle layer and the output layer were explored. Finally, the weight values were instead updated automatically by back propagation, yielding the optimal weight values of the attention mechanism modules and the optimal segmentation model. Experimental results show that, compared with the original Deeplab V3 Plus network structure, the Deeplab V3 Plus network structure with the adaptive attention mechanism increases the Mean Intersection over Union (MIOU) by 1.4 percentage points on the PASCAL VOC2012 public segmentation dataset and by 0.7 percentage points on the plant pest dataset.
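The idea of multiplying an attention module by a weight value and then letting back propagation update that weight can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the residual form `y = x + w * a(x) * x`, the sigmoid channel gate `channel_attention`, the mean squared error loss, and all function names are hypothetical. It only shows how a single scalar weight on an attention branch can be learned by gradient descent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def channel_attention(features):
    """Toy attention (assumed): one sigmoid gate per channel, from the channel mean."""
    return [sigmoid(sum(ch) / len(ch)) for ch in features]


def gated_forward(features, w):
    """Assumed form y = x + w * a(x) * x; the scalar w scales the attention branch."""
    gates = channel_attention(features)
    return [[x + w * g * x for x in ch] for ch, g in zip(features, gates)]


def grad_w(features, target, w):
    """Derivative of the mean squared error with respect to the scalar weight w."""
    gates = channel_attention(features)
    total, count = 0.0, 0
    for ch, target_ch, g in zip(features, target, gates):
        for x, t in zip(ch, target_ch):
            y = x + w * g * x
            total += 2.0 * (y - t) * g * x  # chain rule: dL/dy * dy/dw
            count += 1
    return total / count


def fit_weight(features, target, w=0.0, lr=0.1, steps=200):
    """Plain gradient descent on w, standing in for back propagation."""
    for _ in range(steps):
        w -= lr * grad_w(features, target, w)
    return w


# Two channels with two values each; the target was generated with w = 0.5.
features = [[1.0, 2.0], [0.5, -1.0]]
target = gated_forward(features, 0.5)
learned_w = fit_weight(features, target)
```

With `w = 0` the attention branch is switched off and the input passes through unchanged, so a learned weight lets training decide how strongly each attention module should act rather than fixing that strength by hand.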

Key words: semantic segmentation, subsampling operation, adaptive attention mechanism, weight value of attention mechanism module, Deeplab V3 Plus

Abstract:

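To address the problem that image detail information and small-target information are lost prematurely in the subsampling operations of Deeplab V3 Plus, an image semantic segmentation algorithm with an adaptive attention mechanism based on the Deeplab V3 Plus network architecture is proposed. Firstly, attention mechanism modules are embedded in the input layer, middle layer and output layer of the Deeplab V3 Plus backbone network, and a weight value is introduced and multiplied with each attention mechanism module in order to constrain it. Secondly, the Deeplab V3 Plus embedded with the attention modules is trained on the PASCAL VOC2012 public segmentation dataset, and the weight values of the attention mechanism modules (empirical values) are obtained manually. Then, various ways of fusing the attention mechanism modules in the input layer, middle layer and output layer are explored. Finally, the weight values of the attention mechanism modules are changed to be updated automatically by back propagation, which gives the optimal weight values of the attention mechanism modules and the optimal segmentation model. Experimental results show that, compared with the original Deeplab V3 Plus network structure, the Deeplab V3 Plus network structure with the adaptive attention mechanism improves the Mean Intersection over Union (MIOU) by 1.4 percentage points on the PASCAL VOC2012 public segmentation dataset and by 0.7 percentage points on the plant pest dataset.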
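For readers unfamiliar with the reported metric, the following minimal sketch computes Mean Intersection over Union on flattened label maps. The function name `mean_iou`, the toy labels, and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions for illustration. Benchmark evaluations usually accumulate a confusion matrix over the whole dataset, so this is not the paper's evaluation code.

```python
def mean_iou(pred, truth, num_classes):
    """Mean IoU over the classes present in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps (intersection) or in either map (union).
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the inputs")
    return sum(ious) / len(ious)


# Toy example with six pixels and three classes.
pred = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, truth, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

An improvement stated in percentage points is an absolute difference between two such scores expressed as percentages, not a relative change.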

Key words: semantic segmentation, subsampling operation, adaptive attention mechanism, weight value of attention mechanism module, Deeplab V3 Plus

CLC Number: