Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (9): 2900-2908.DOI: 10.11772/j.issn.1001-9081.2021071136
• Multimedia computing and computer simulation •
Received: 2021-07-01
Revised: 2021-09-13
Accepted: 2021-09-15
Online: 2021-09-22
Published: 2022-09-10
Contact: Lizhi LIU
About author: LI Yaoshun, born in 1998, male, from Jingzhou, Hubei, M.S. candidate. His research interests include deep learning and object detection.
Yaoshun LI, Lizhi LIU. Lightweight network for rebar detection with attention mechanism[J]. Journal of Computer Applications, 2022, 42(9): 2900-2908.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2021071136
Channel | Anchor boxes |
---|---|
13 | [116, 90], [156, 198], [373, 326] |
26 | [ |
52 | [ |
Tab. 1 Anchors of YOLOv3
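As a rough illustration of how anchors such as those in Tab. 1 are commonly used in YOLO-style detectors, the sketch below picks the anchor whose shape best matches a ground-truth box by comparing widths and heights only. It is not taken from the paper: the function names `wh_iou` and `best_anchor` are hypothetical, and reading each pair as (width, height) in pixels is an assumption. Only the three anchors of the 13 row are used, because the other two rows are truncated in the table.

```python
from typing import List, Tuple

# Anchors of the 13 row in Tab. 1, read as (width, height) pairs (assumption).
ANCHORS_13: List[Tuple[float, float]] = [(116, 90), (156, 198), (373, 326)]


def wh_iou(box_wh: Tuple[float, float], anchor_wh: Tuple[float, float]) -> float:
    """IoU of two boxes that share the same centre, so only sizes matter."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union


def best_anchor(box_wh: Tuple[float, float],
                anchors: List[Tuple[float, float]] = ANCHORS_13) -> int:
    """Return the index of the anchor with the highest shape IoU."""
    scores = [wh_iou(box_wh, a) for a in anchors]
    return max(range(len(anchors)), key=scores.__getitem__)


if __name__ == "__main__":
    for wh in [(120, 100), (400, 300)]:
        idx = best_anchor(wh)
        print(wh, "-> anchor", idx, ANCHORS_13[idx])
```

A box of roughly 120 × 100 pixels is closest in shape to the first anchor, while a much larger box falls to the third; a full detector would repeat this over the anchors of all three rows.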
Object detection network | Convolutional layers | Total trainable parameters/10⁴ | GPU memory usage/MB | Model weights/MB |
---|---|---|---|---|
EfficientDet | 176 | 359 | 2 927 | 15 |
SSD | 35 | 2 374 | 3 146 | 91 |
CenterNet | 62 | 3 266 | 2 166 | 124 |
RetinaNet | 53 | 2 350 | 3 919 | 138 |
Faster RCNN | 43 | 854 | 5 230 | 108 |
YOLOv3 | 75 | 6 157 | 5 192 | 235 |
YOLOv4 | 182 | 6 393 | 4 184 | 244 |
YOLOv5m | 94 | 2 156 | 4 876 | 57 |
Proposed network | 30 | 347 | 1 956 | 13 |
Tab. 2 Comparison of model parameters
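Dataset | Number of images | Annotation file | Usage |
---|---|---|---|
Train | 225 | train.txt | Model training |
Val | 25 | val.txt | mAP calculation during model training |
Test | 200 | Manual counting, used to evaluate model Accuracy and FPS |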
Tab. 3 Partition and usage of dataset
Object detection network | TrainTime/s | mAP | Accuracy | FPS/(frame∙s⁻¹) |
---|---|---|---|---|
EfficientDet | 21.5 | 0.010 | 0.056 | 17.3 |
SSD | 20.5 | 0.117 | 0.227 | 45.7 |
CenterNet | 23.8 | 0.123 | 0.278 | 43.4 |
RetinaNet | 36.2 | 0.462 | 0.504 | 21.8 |
Faster RCNN | 97.8 | 0.682 | 0.517 | 11.2 |
YOLOv3 | 43.4 | 0.889 | 0.887 | 38.2 |
YOLOv4 | 27.5 | 0.916 | 0.923 | 26.8 |
YOLOv5m | 31.1 | 0.931 | 0.933 | 58.1 |
Proposed network | 24.5 | 0.927 | 0.931 | 106.8 |
Tab. 4 Evaluation indexes of different networks
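To read Tab. 2 and Tab. 4 side by side, the short sketch below computes a few ratios between the last row of each table (the paper's own network) and YOLOv5m, the baseline with the highest mAP. The numbers are copied from the two tables; the dictionary layout and the function name `compare` are hypothetical and only serve the illustration.

```python
from typing import Dict

# Values copied from Tab. 2 (parameters in units of 10^4) and Tab. 4 (mAP, FPS).
RESULTS: Dict[str, Dict[str, float]] = {
    "YOLOv5m": {"params_1e4": 2156, "map": 0.931, "fps": 58.1},
    "Proposed network": {"params_1e4": 347, "map": 0.927, "fps": 106.8},
}


def compare(candidate: str, baseline: str,
            results: Dict[str, Dict[str, float]] = RESULTS) -> Dict[str, float]:
    """Return speed ratio, parameter ratio and mAP difference vs. a baseline."""
    c, b = results[candidate], results[baseline]
    return {
        "fps_ratio": round(c["fps"] / b["fps"], 2),
        "param_ratio": round(c["params_1e4"] / b["params_1e4"], 3),
        "map_delta": round(c["map"] - b["map"], 3),
    }


if __name__ == "__main__":
    print(compare("Proposed network", "YOLOv5m"))
```

On these figures the last-row network runs at about 1.8 times the frame rate of YOLOv5m with roughly one sixth of the parameters, at a cost of 0.004 in mAP.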