Journal of Computer Applications, 2024, Vol. 44, Issue (5): 1605-1612. DOI: 10.11772/j.issn.1001-9081.2023050687
Special Issue: Multimedia computing and computer simulation
Xiaogang SONG, Dongdong ZHANG, Pengfei ZHANG, Li LIANG, Xinhong HEI
Received: 2023-05-30
Revised: 2023-09-12
Accepted: 2023-09-14
Online: 2023-09-19
Published: 2024-05-10
Contact: Xiaogang SONG
About author: ZHANG Dongdong, born in 1998 in Chenzhou, Hunan, M. S. candidate. His research interests include object detection and video action recognition.
Xiaogang SONG, Dongdong ZHANG, Pengfei ZHANG, Li LIANG, Xinhong HEI. Real-time object detection algorithm for complex construction environments[J]. Journal of Computer Applications, 2024, 44(5): 1605-1612.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2023050687
Algorithm | AP0.5/% (tank truck) | AP0.5/% (person) | AP0.5/% (license plate) | AP0.5/% (water pipe) | AP0.5/% (bucket) | mAP0.5/% | Parameters/10^6 | GFLOPs | FPS |
---|---|---|---|---|---|---|---|---|---|
Faster R-CNN | 89.3 | 86.9 | 87.1 | 43.9 | 85.8 | 78.6 | 27.82 | 130.7 | 8 |
SSD | 90.5 | 87.6 | 88.4 | 46.2 | 83.6 | 79.2 | 26.28 | 35.1 | 64 |
YOLOv4-tiny | 92.3 | 88.4 | 87.9 | 45.7 | 87.4 | 80.3 | 6.06 | 6.9 | 137 |
YOLOXs | 96.9 | 94.7 | 94.5 | 53.6 | 93.8 | 86.7 | 9.02 | 26.8 | 94 |
YOLOv5s | 99.1 | 96.4 | 97.4 | 56.9 | 96.4 | 89.2 | 7.02 | 15.8 | 163 |
PP-YOLOEs | 99.0 | 96.3 | 98.1 | 63.8 | 96.4 | 90.7 | 7.93 | 17.4 | 150 |
YOLO-C | 99.2 | 96.7 | 98.7 | 65.4 | 97.6 | 91.5 | 4.16 | 16.9 | 159 |
Tab. 1 Performance comparison of different algorithms on self-built dataset
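The mAP0.5 column of Tab. 1 appears to be the unweighted mean of the five per-class AP0.5 values. The minimal Python sketch below checks that reading against the reported figures; the dictionary layout and the helper name `mean_ap` are illustrative assumptions, not part of the paper. Because the per-class values are themselves rounded, agreement is only expected to within about 0.1 percentage points.

```python
# Per-class AP0.5 values (%) copied from Tab. 1, in the order:
# tank truck, person, license plate, water pipe, bucket.
PER_CLASS_AP = {
    "Faster R-CNN": [89.3, 86.9, 87.1, 43.9, 85.8],
    "SSD": [90.5, 87.6, 88.4, 46.2, 83.6],
    "YOLOv4-tiny": [92.3, 88.4, 87.9, 45.7, 87.4],
    "YOLOXs": [96.9, 94.7, 94.5, 53.6, 93.8],
    "YOLOv5s": [99.1, 96.4, 97.4, 56.9, 96.4],
    "PP-YOLOEs": [99.0, 96.3, 98.1, 63.8, 96.4],
    "YOLO-C": [99.2, 96.7, 98.7, 65.4, 97.6],
}

# mAP0.5 values (%) as reported in Tab. 1.
REPORTED_MAP = {
    "Faster R-CNN": 78.6,
    "SSD": 79.2,
    "YOLOv4-tiny": 80.3,
    "YOLOXs": 86.7,
    "YOLOv5s": 89.2,
    "PP-YOLOEs": 90.7,
    "YOLO-C": 91.5,
}


def mean_ap(ap_values):
    """Unweighted mean of per-class AP values (hypothetical helper)."""
    return sum(ap_values) / len(ap_values)


if __name__ == "__main__":
    for name, aps in PER_CLASS_AP.items():
        computed = mean_ap(aps)
        gap = abs(computed - REPORTED_MAP[name])
        print(f"{name}: computed {computed:.2f}, "
              f"reported {REPORTED_MAP[name]}, gap {gap:.2f}")
```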
No. | GhostConv | Multi-scale detection | CSA | VariFocal Loss | mAP0.5/% | Parameters/10^6 | GFLOPs | FPS |
---|---|---|---|---|---|---|---|---|
0 |  |  |  |  | 89.2 | 7.02 | 15.8 | 163 |
1 | √ |  |  |  | 88.6 | 3.69 | 8.2 | 192 |
2 |  | √ |  |  | 90.0 | 7.69 | 27.1 | 96 |
3 |  |  | √ |  | 89.6 | 7.11 | 18.4 | 128 |
4 |  |  |  | √ | 89.8 | 7.02 | 15.7 | 158 |
5 | √ | √ |  |  | 89.1 | 4.03 | 13.8 | 125 |
6 | √ | √ | √ |  | 91.2 | 4.16 | 16.9 | 157 |
7 | √ | √ | √ | √ | 91.5 | 4.16 | 16.9 | 159 |
Tab. 2 Ablation experimental results of proposed algorithm on self-built dataset
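To read the ablation at a glance, the following Python sketch compares the full configuration (row 7) with the baseline (row 0) using the figures in Tab. 2. The tuple layout and the helper `relative_change` are assumptions made for illustration only; the numbers are copied from the table.

```python
# (mAP0.5 %, parameters in 10^6, GFLOPs, FPS) copied from Tab. 2.
BASELINE = (89.2, 7.02, 15.8, 163)   # row 0: no modules enabled
FULL = (91.5, 4.16, 16.9, 159)       # row 7: all four modules enabled


def relative_change(old, new):
    """Percentage change from old to new (hypothetical helper)."""
    return (new - old) / old * 100.0


def summarize(baseline, full):
    """Return the mAP gain in points and relative changes in percent."""
    return {
        "map_gain_points": full[0] - baseline[0],
        "params_change_pct": relative_change(baseline[1], full[1]),
        "gflops_change_pct": relative_change(baseline[2], full[2]),
        "fps_change_pct": relative_change(baseline[3], full[3]),
    }


if __name__ == "__main__":
    for key, value in summarize(BASELINE, FULL).items():
        print(f"{key}: {value:+.1f}")
```

On these figures the parameter count falls by roughly 41% while mAP0.5 rises by 2.3 points, at the cost of slightly more GFLOPs and a small drop in FPS.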
1 | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. 10.1145/3065386 |
2 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. 10.1109/tpami.2016.2577031 |
3 | LIN T-Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. 10.1109/cvpr.2017.106 |
4 | HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. 10.1109/iccv.2017.322 |
5 | CAI Z, VASCONCELOS N. Cascade R-CNN: delving into high quality object detection[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6154-6162. 10.1109/cvpr.2018.00644 |
6 | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// Proceedings of the 2016 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. 10.1109/cvpr.2016.91 |
7 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// Proceedings of the 14th European Conference on Computer Vision. Berlin: Springer, 2016: 21-37. 10.1007/978-3-319-46448-0_2 |
8 | REDMON J, FARHADI A. YOLO9000: better, faster, stronger[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525. 10.1109/cvpr.2017.690 |
9 | LIN T-Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. 10.1109/iccv.2017.324 |
10 | REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08) [2023-05-07]. |
11 | BOCHKOVSKIY A, WANG C-Y, LIAO H-Y M. YOLOv4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23) [2023-05-07]. |
12 | GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021[EB/OL]. (2021-08-06) [2023-05-07]. |
13 | XU S, WANG X, LV W, et al. PP-YOLOE: an evolved version of YOLO[EB/OL]. (2022-12-12) [2023-05-07]. |
14 | LIU H, ZHANG L Y, WANG F G, et al. Object detection algorithm based on attention mechanism and context information[J]. Journal of Computer Applications, 2023, 43(5): 1557-1564. |
15 | LI J D, ZHANG D P, FAN Y Q, et al. Lightweight ship target detection algorithm based on improved YOLOv5[J]. Journal of Computer Applications, 2023, 43(3): 923-929. 10.11772/j.issn.1001-9081.2022071096 |
16 | WANG H J, LI G M, ZHANG H L, et al. Rotating object detection method based on convolutional block channel attention in remote sensing images[J/OL]. Computer Engineering and Applications: 1-14 [2023-05-27]. 10.3778/j.issn.1002-8331.2211-0037 |
17 | SHENG B Y, HOU J, LI J X, et al. Road object detection method for complex road scenes[J]. Computer Engineering and Applications, 2023, 59(15): 87-96. 10.3778/j.issn.1002-8331.2212-0093 |
18 | WANG C-Y, LIAO H-Y M, WU Y-H, et al. CSPNet: a new backbone that can enhance learning capability of CNN[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2020: 1571-1580. 10.1109/cvprw50498.2020.00203 |
19 | HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. 10.1109/tpami.2015.2389824 |
20 | HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7132-7141. 10.1109/cvpr.2018.00745 |
21 | ZHANG H, WANG Y, DAYOUB F, et al. VarifocalNet: an IoU-aware dense object detector[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE,2021: 8514-8523. 10.1109/cvpr46437.2021.00841 |
22 | HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1580-1589. 10.1109/cvpr42600.2020.00165 |