Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (9): 2900-2908. DOI: 10.11772/j.issn.1001-9081.2021071136
• Multimedia computing and computer simulation •
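Received: 2021-07-01
Revised: 2021-09-13
Accepted: 2021-09-15
Online: 2021-09-22
Published: 2022-09-10
Corresponding author: Lizhi LIU
About the author: LI Yaoshun, born in 1998 in Jingzhou, Hubei, M. S. candidate. His research interests include deep learning and object detection.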
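Abstract:
Devices on smart construction sites have limited memory and computing power, so real-time rebar detection by object detection on on-site devices is difficult: detection is slow and model deployment is costly. To address these problems, a lightweight rebar detection network with embedded attention mechanisms, RebarNet, was proposed on the basis of the YOLOv3 network. Firstly, residual blocks were used as the basic units of the network to build the feature extraction structure, which extracts local and contextual information. Secondly, a Channel Attention (CA) module and a Spatial Attention (SA) module were added to the residual blocks to adjust the attention weights of the feature maps and improve the feature extraction ability of the network. Thirdly, a feature pyramid fusion module was adopted to enlarge the receptive field of the network and improve feature extraction for medium-sized rebar images. Finally, the feature map of the 52×52 output channel, obtained after 8-fold downsampling, was output for post-processing and rebar detection. Experimental results show that the proposed network has only 5% of the parameters of the Darknet53 network and reaches a mAP of 92.7% at 106.8 FPS on the rebar test set. Compared with eight existing object detection networks (EfficientDet, SSD, CenterNet, RetinaNet, Faster RCNN, YOLOv3, YOLOv4 and YOLOv5m), RebarNet has a relatively short training time (24.5 s), the lowest GPU memory usage (1 956 MB) and the smallest model weight file (13 MB). Compared with YOLOv5m, the best-performing network at present, RebarNet has a mAP that is 0.4 percentage points lower, but its detection speed is 48 FPS higher, 1.8 times that of YOLOv5m. These results show that the proposed network helps accomplish the efficient and accurate rebar detection required on smart construction sites.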
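The abstract does not give the internals of the CA and SA modules, so the following is only a minimal sketch of how a residual unit with channel and spatial attention might look. It assumes a CBAM-style design (channel gate first, then spatial gate), replaces the spatial convolution with a simple per-pixel mix, and omits the convolutions of a real residual block. All function names and weight values are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap, w1, w2):
    """CBAM-style channel gate: sigmoid(MLP(avg) + MLP(max)) per channel.

    fmap: list of C channels, each an H x W list of lists.
    w1: hidden x C weights, w2: C x hidden weights (hypothetical values).
    """
    flat = [[v for row in ch for v in row] for ch in fmap]
    avg = [sum(ch) / len(ch) for ch in flat]   # global average pooling
    mx = [max(ch) for ch in flat]              # global max pooling

    def mlp(vec):
        # Shared two-layer MLP with ReLU in between.
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    gates = [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, fmap)]

def spatial_attention(fmap, w_avg, w_max):
    """Simplified spatial gate: sigmoid(w_avg * mean_c + w_max * max_c) per pixel.

    Assumption: CBAM uses a 7x7 convolution here; a per-pixel mix keeps this short.
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    gate = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            column = [ch[i][j] for ch in fmap]
            mean_c = sum(column) / len(column)
            gate[i][j] = sigmoid(w_avg * mean_c + w_max * max(column))
    return [[[gate[i][j] * ch[i][j] for j in range(w)] for i in range(h)]
            for ch in fmap]

def attention_residual_block(x, w1, w2, w_avg, w_max):
    """Residual unit x + SA(CA(x)); the block's convolutions are omitted."""
    refined = spatial_attention(channel_attention(x, w1, w2), w_avg, w_max)
    return [[[a + b for a, b in zip(rx, rr)] for rx, rr in zip(cx, cr)]
            for cx, cr in zip(x, refined)]

# Toy input: 2 channels of 2 x 2 pixels, with hypothetical weights.
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
w1 = [[0.5, -0.5]]    # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]  # 1 hidden unit -> 2 channels
y = attention_residual_block(x, w1, w2, w_avg=1.0, w_max=1.0)
```

Because both gates lie strictly between 0 and 1, each output value for this positive toy input falls between the input value and twice the input value: the attention branch rescales features without changing the shape of the feature map.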
Yaoshun LI, Lizhi LIU. Lightweight network for rebar detection with attention mechanism[J]. Journal of Computer Applications, 2022, 42(9): 2900-2908.
Channel | Anchor boxes |
---|---|
13 | [116, 90], [156, 198], [373, 326] |
26 | [ |
52 | [ |
Tab. 1 Anchors of YOLOv3
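To illustrate how the anchors in Tab. 1 are used, here is a small sketch of the standard YOLOv3 box decoding for the 13×13 channel. It assumes a 416×416 input image (so the stride is 32), which is a common YOLOv3 setting but is not stated on this page. The function name `decode_box` is hypothetical. The rows for the 26 and 52 channels are truncated in the table, so they are not used.

```python
import math

# Anchors (width, height in pixels) listed in Tab. 1 for the 13 x 13 channel.
ANCHORS_13 = [(116, 90), (156, 198), (373, 326)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, col, row, anchor, grid=13, img_size=416):
    """Decode one raw prediction into a box centre and size in pixels.

    YOLOv3 decoding: centre = (sigmoid(t) + cell index) * stride,
    size = anchor size * exp(t). img_size=416 is an assumption.
    """
    stride = img_size / grid
    bx = (sigmoid(tx) + col) * stride
    by = (sigmoid(ty) + row) * stride
    bw = anchor[0] * math.exp(tw)
    bh = anchor[1] * math.exp(th)
    return bx, by, bw, bh

# A zero prediction in the central cell reproduces the anchor itself.
box = decode_box(0.0, 0.0, 0.0, 0.0, col=6, row=6, anchor=ANCHORS_13[0])
print(box)  # (208.0, 208.0, 116.0, 90.0)
```

Under the same 416×416 assumption, the 52×52 channel corresponds to a stride of 8, which matches the 8-fold downsampling mentioned in the abstract.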
Object detection network | Convolutional layers | Trainable parameters/10⁴ | GPU memory/MB | Model weights/MB |
---|---|---|---|---|
EfficientDet | 176 | 359 | 2 927 | 15 |
SSD | 35 | 2 374 | 3 146 | 91 |
CenterNet | 62 | 3 266 | 2 166 | 124 |
RetinaNet | 53 | 2 350 | 3 919 | 138 |
Faster RCNN | 43 | 854 | 5 230 | 108 |
YOLOv3 | 75 | 6 157 | 5 192 | 235 |
YOLOv4 | 182 | 6 393 | 4 184 | 244 |
YOLOv5m | 94 | 2 156 | 4 876 | 57 |
Proposed network | 30 | 347 | 1 956 | 13 |
Tab. 2 Comparison of model parameters
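The relation between parameter count and weight file size in Tab. 2 can be checked with a short calculation. The sketch below assumes that weights are stored as 32-bit floats and that the reported sizes are in binary megabytes (MiB). Neither is stated on this page, and not every row of the table follows this estimate (Faster RCNN and RetinaNet, for example, do not). It also uses the YOLOv3 row as a stand-in for the Darknet53 network named in the abstract, which is an assumption.

```python
# Rows copied from Tab. 2: (trainable parameters in units of 10^4, weight file in MB).
TABLE2 = {
    "YOLOv3": (6157, 235),
    "YOLOv4": (6393, 244),
    "RebarNet": (347, 13),
}

def float32_size_mib(params_1e4):
    """Estimated weight size in MiB, assuming 4 bytes per parameter."""
    return params_1e4 * 1e4 * 4 / 2**20

# Parameter ratio of the proposed network to the YOLOv3 row.
param_ratio = TABLE2["RebarNet"][0] / TABLE2["YOLOv3"][0]

# Estimated file sizes next to the reported ones.
estimates = {name: float32_size_mib(p) for name, (p, _) in TABLE2.items()}

for name, (params, reported) in TABLE2.items():
    print(f"{name}: estimated {estimates[name]:.1f} MiB, reported {reported} MB")
print(f"parameter ratio: {param_ratio:.1%}")
```

For these three rows the estimate stays within about 0.3 MB of the reported weight size, and the parameter ratio comes out at roughly 5.6%, in line with the "5%" figure in the abstract.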
Dataset | Number of images | Label file | Usage |
---|---|---|---|
Train | 225 | train.txt | Model training |
Val | 25 | val.txt | mAP calculation during model training |
Test | 200 | | Counted manually; used to evaluate model Accuracy and FPS |
Tab. 3 Partition and usage of dataset
Object detection network | TrainTime/s | mAP | Accuracy | FPS/(frame·s⁻¹) |
---|---|---|---|---|
EfficientDet | 21.5 | 0.010 | 0.056 | 17.3 |
SSD | 20.5 | 0.117 | 0.227 | 45.7 |
CenterNet | 23.8 | 0.123 | 0.278 | 43.4 |
RetinaNet | 36.2 | 0.462 | 0.504 | 21.8 |
Faster RCNN | 97.8 | 0.682 | 0.517 | 11.2 |
YOLOv3 | 43.4 | 0.889 | 0.887 | 38.2 |
YOLOv4 | 27.5 | 0.916 | 0.923 | 26.8 |
YOLOv5m | 31.1 | 0.931 | 0.933 | 58.1 |
Proposed network | 24.5 | 0.927 | 0.931 | 106.8 |
Tab. 4 Evaluation indexes of different networks
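The comparison with YOLOv5m quoted in the abstract can be reproduced directly from Tab. 4. The sketch below only rearranges the tabulated numbers. The function name `compare` and the per-frame latency (computed as 1000/FPS) are illustrative additions, not quantities reported by the authors.

```python
# (mAP, FPS) copied from Tab. 4.
YOLOV5M = (0.931, 58.1)
REBARNET = (0.927, 106.8)

def compare(base, new):
    """Return the mAP gap, FPS gain, speed-up and latency of `new` against `base`."""
    base_map, base_fps = base
    new_map, new_fps = new
    return {
        "map_gap_points": round((base_map - new_map) * 100, 1),  # percentage points
        "fps_gain": round(new_fps - base_fps, 1),
        "speedup": round(new_fps / base_fps, 2),
        "latency_ms": round(1000 / new_fps, 2),                  # time per frame
    }

result = compare(YOLOV5M, REBARNET)
print(result)
# {'map_gap_points': 0.4, 'fps_gain': 48.7, 'speedup': 1.84, 'latency_ms': 9.36}
```

The figures agree with the abstract: a mAP gap of 0.4 percentage points, a gain of about 48 FPS, and a speed of roughly 1.8 times that of YOLOv5m.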