Journal of Computer Applications (《计算机应用》) ›› 2022, Vol. 42 ›› Issue (9): 2900-2908. DOI: 10.11772/j.issn.1001-9081.2021071136

• Multimedia computing and computer simulation •

Lightweight network for rebar detection with attention mechanism

Yaoshun LI, Lizhi LIU

  1. Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology), Wuhan, Hubei 430205, China
  • Received: 2021-07-01 Revised: 2021-09-13 Accepted: 2021-09-15 Online: 2021-09-22 Published: 2022-09-10
  • Contact: Lizhi LIU
  • About author: LI Yaoshun, born in 1998, a native of Jingzhou, Hubei, M.S. candidate. His research interests include deep learning and object detection.
  • Supported by:
    Guidance Project of Scientific Research Plan of Hubei Provincial Department of Education (B2017051); Open Fund of Hubei Key Laboratory of Intelligent Robot (HBIRL202002)

Abstract:

Devices on smart construction sites have limited memory and computing power, which makes it difficult to detect rebar in real time through object detection on on-site equipment; detection is slow and model deployment is costly. To address these problems, RebarNet, a lightweight rebar detection network with embedded attention mechanisms, was proposed on the basis of YOLOv3 (You Only Look Once version 3). Firstly, residual blocks were used as the basic units of the network to build a feature extraction structure that captures local and contextual information. Secondly, a Channel Attention (CA) module and a Spatial Attention (SA) module were added to the residual blocks to adjust the attention weights of the feature maps and improve the network's ability to extract features. Thirdly, a feature pyramid fusion module was adopted to enlarge the receptive field of the network and improve extraction for medium-sized rebar images. Finally, a 52×52 feature map, obtained after 8× downsampling, was output for post-processing and rebar detection. Experimental results show that the proposed network has only 5% of the parameters of the Darknet53 network and achieves 92.7% mAP (mean Average Precision) at 106.8 FPS (Frames Per Second) on the rebar test set. Compared with eight existing object detection networks, namely EfficientDet (Scalable and Efficient Object Detection), SSD (Single Shot MultiBox Detector), CenterNet, RetinaNet, Faster RCNN (Faster Region-CNN), YOLOv3, YOLOv4 and YOLOv5m (YOLOv5 medium), RebarNet has a shorter training time (24.5 s), the lowest GPU memory usage (1 956 MB) and the smallest model weight file (13 MB). Compared with the current best-performing network, YOLOv5m, RebarNet's mAP is 0.4 percentage points lower, but its detection speed is 48 FPS higher, 1.8 times that of YOLOv5m. These results indicate that the proposed network helps to accomplish efficient and accurate rebar detection on smart construction sites.

Key words: rebar detection, YOLOv3, attention mechanism, feature pyramid, lightweight network
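
The abstract names the building blocks of RebarNet but does not give their internals, so the following is only a toy sketch of how a residual block with channel and spatial attention might be arranged. Everything in it is an assumption made for illustration: the CA-then-SA order, the parameter-free stand-ins that replace the learned layers a real attention module would contain, the hypothetical `transform` placeholder for the block's convolutions, and the 416×416 input size (inferred from the stated 52×52 output at 8× downsampling). It is not the authors' implementation.

```python
import math

# A feature map is a list of channels; each channel is a list of rows of floats.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap):
    """Scale each channel by a weight derived from its global average.

    Assumption: a real CA module would pass the pooled values through
    learned layers; here the sigmoid is applied directly to the average.
    """
    out = []
    for ch in fmap:
        count = sum(len(row) for row in ch)
        avg = sum(sum(row) for row in ch) / count
        weight = sigmoid(avg)
        out.append([[v * weight for v in row] for row in ch])
    return out

def spatial_attention(fmap):
    """Scale each pixel by a weight derived from cross-channel mean and max.

    Assumption: a real SA module would apply a learned convolution to the
    mean and max maps; here they are simply summed before the sigmoid.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [fmap[k][i][j] for k in range(c)]
            mask[i][j] = sigmoid(sum(vals) / c + max(vals))
    return [[[fmap[k][i][j] * mask[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]

def attention_residual_block(fmap, transform=lambda f: f):
    """Residual block computing x + SA(CA(transform(x))).

    `transform` is a hypothetical placeholder for the block's convolutions.
    """
    branch = spatial_attention(channel_attention(transform(fmap)))
    return [[[x + b for x, b in zip(x_row, b_row)]
             for x_row, b_row in zip(x_ch, b_ch)]
            for x_ch, b_ch in zip(fmap, branch)]

def output_size(input_size, stride=8):
    """Side length of the feature map after downsampling by `stride`."""
    return input_size // stride

# Toy input: 2 channels of 2x2 values.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [2.0, 1.0]]]
y = attention_residual_block(x)
print(len(y), len(y[0]), len(y[0][0]))  # 2 2 2 (shape is preserved)
print(output_size(416))                 # 52, assuming a 416x416 input
```

Because both attention weights lie strictly between 0 and 1, the attention branch can only attenuate the features it receives, and the skip connection keeps the original signal intact; this is the usual motivation for placing attention inside a residual block.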

CLC Number: