Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (5): 1408-1415. DOI: 10.11772/j.issn.1001-9081.2025050595

• Artificial intelligence •

Improved DeepLabV3+ method based on adaptive attention and nested receptive field

Changzheng XING, Xin ZHENG, Di JIA, Junfeng LIANG

  1. School of Electronics and Information Engineering, Liaoning Technical University, Huludao Liaoning 125105, China
  • Received: 2025-06-03 Revised: 2025-08-29 Accepted: 2025-09-09 Online: 2025-09-15 Published: 2026-05-10
  • Contact: Xin ZHENG
  • About author: XING Changzheng, born in 1967, Ph. D., professor. His research interests include artificial intelligence and information processing.
    JIA Di, born in 1982, Ph. D., professor. His research interests include stereo matching and 3D reconstruction, photogrammetry, visual spatial positioning, visual robotic arm operations.
    LIANG Junfeng, born in 2000, M. S. candidate. His research interests include data mining and reinforcement learning.
  • Supported by:
    National Key Research and Development Program of China (2018YFB402900); Key Project of Educational Department of Liaoning Province (LJ212410147003)

基于自适应注意力与嵌套感受野改进DeepLabV3+方法

邢长征, 郑鑫, 贾迪, 梁浚锋

  1. 辽宁工程技术大学 电子与信息工程学院,辽宁 葫芦岛 125105
  • 通讯作者: 郑鑫
  • 作者简介:邢长征(1967—),男,辽宁阜新人,教授,博士,CCF会员,主要研究方向:人工智能、信息处理
    贾迪(1982—),男,河北邢台人,教授,博士,主要研究方向:立体匹配与三维重建、摄影测量、视觉空间定位、视觉机械臂作业
    梁浚锋(2000—),男,河南焦作人,硕士研究生,主要研究方向:数据挖掘、强化学习。
  • 基金资助:
    国家重点研发计划项目(2018YFB402900);辽宁省教育厅重点项目(LJ212410147003)

Abstract:

To address the high complexity and the low segmentation accuracy on certain classes that arise in DeepLabV3+ from its use of atrous convolutions with different dilation rates, an improved method that integrates an Evolutionary Nested Receptive Field (ENRF) module with an Adaptive Class-Channel Attention (ACCA) mechanism was proposed. In this method, the original Atrous Spatial Pyramid Pooling (ASPP) module was replaced by the ENRF module, and the ACCA mechanism was applied to the fused features. This enabled continuous expansion of the receptive field and finer-grained feature representation, while reducing the number of parameters and the computational overhead and thus making the model more lightweight. Firstly, the ACCA mechanism was constructed by combining channel-adaptive attention with class-adaptive attention; it exploited inter-channel and inter-class feature dependencies to strengthen the representation of critical information in feature maps. Secondly, the ENRF module was designed by introducing convolution kernels of different sizes and dilation rates, forming a nested evolutionary receptive field structure that gradually enlarged the receptive field to capture multi-scale contextual information and fine-grained boundary details. The improved method was compared with Fully Convolutional Network with 8s skip connections (FCN8s), Pyramid Scene Parsing Network (PSPNet), Unified Perceptual Parsing Network (UPerNet), Bilateral Segmentation Network Version 2 (BiSeNet V2), Deep Feature Aggregation Network (DFANet), and the original DeepLabV3+ in terms of five metrics: floating-point operations (FLOPs), parameter count, mean Intersection over Union (mIoU), inference speed, and memory usage. Experimental results show that the improved DeepLabV3+ reduces the parameter count and FLOPs, accelerates inference, and improves segmentation performance.
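
The idea of a receptive field that is "gradually enlarged" by kernels of different sizes and dilation rates can be illustrated with standard receptive-field arithmetic. The sketch below is not the ENRF configuration from the paper, which the abstract does not specify. The cascade values are hypothetical, the function names `effective_kernel` and `receptive_field` are illustrative, and the parallel rates 6, 12, and 18 are the ones commonly used for ASPP. A kernel of size k with dilation d spans k + (k - 1)(d - 1) positions, and stacking layers accumulates that span.

```python
def effective_kernel(k: int, d: int) -> int:
    """Number of input positions spanned by a k-tap kernel with dilation d."""
    return k + (k - 1) * (d - 1)


def receptive_field(layers) -> int:
    """Receptive field of layers applied in sequence.

    `layers` is an iterable of (kernel_size, dilation, stride) tuples.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for k, d, s in layers:
        rf += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return rf


# Parallel branches, each a single 3x3 atrous convolution (ASPP-style rates).
parallel = [[(3, 6, 1)], [(3, 12, 1)], [(3, 18, 1)]]
parallel_rf = [receptive_field(branch) for branch in parallel]

# Hypothetical cascade: each stage sees the previous stage's output, so the
# receptive fields are nested and grow step by step.
cascade = [(3, 1, 1), (3, 2, 1), (5, 3, 1)]
nested = [receptive_field(cascade[: i + 1]) for i in range(len(cascade))]

print(parallel_rf)  # [13, 25, 37]
print(nested)       # [3, 7, 19]
```

In the parallel case each branch samples one fixed scale with gaps between taps, whereas in the cascade every intermediate scale is covered on the way to the largest one, which is one plausible reading of "continuous expansion of the receptive field".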

Key words: nested evolution, lightweighting, feature dependency, dilation rate, DeepLabV3+

摘要:

针对DeepLabV3+模型因使用不同膨胀率空洞卷积导致复杂度高及部分类别分割精度低的问题,提出一种融合进化式嵌套感受野(ENRF)模块与自适应类别通道注意力(ACCA)机制的改进方法。该方法将原有空洞空间卷积池化金字塔(ASPP)模块替换为ENRF模块,并在融合特征中引入ACCA机制,实现了感受野的连续拓展与更精细化的特征表达,同时降低了参数量和计算开销,提升了模型的轻量化水平。首先,ACCA机制通过融合通道自适应注意力与类别自适应2种注意力机制,挖掘通道间和类别间的特征依赖关系,提升特征图中关键信息的表达能力;其次,ENRF模块引入不同大小和不同膨胀率的卷积核,构建了一种基于嵌套感受野演化的网络结构,以扩大特征图的感受野,捕捉多尺度的上下文信息及细粒度的边缘特征。与全卷积网络(FCN8s)、金字塔场景解析网络(PSPNet)、统一感知解析网络(UPerNet)、双向分割网络(BiSeNet V2)、深度特征聚合网络(DFANet)以及原始DeepLabV3+在浮点运算次数(FLOPs)、参数量、均值交并比(mIoU)、推理速度和内存占用5个指标上进行对比实验的结果表明,改进后的DeepLabV3+方法在减少参数量和FLOPs的同时,也提高了推理速度并改善了图像分割性能。

关键词: 嵌套演化, 轻量化, 特征依赖, 膨胀率, DeepLabV3+
