Automatic detection and recognition of electric vehicle helmet based on improved YOLOv5s

doi:10.11772/j.issn.1001-9081.2022020313

Abstract:

To address the low detection precision, poor robustness, and immature supporting systems in current small-object detection of electric vehicle helmets, an electric vehicle helmet detection model based on an improved YOLOv5s algorithm was proposed. In the proposed model, the Convolutional Block Attention Module (CBAM) and the Coordinate Attention (CA) module were introduced, and an improved Non-Maximum Suppression (NMS), namely Distance Intersection over Union-Non-Maximum Suppression (DIoU-NMS), was used. In addition, multi-scale feature fusion detection was added and combined with a densely connected network to improve feature extraction. Finally, a helmet detection system for electric vehicle drivers was built. On the self-built electric vehicle helmet-wearing dataset, the improved YOLOv5s algorithm increased the mean Average Precision (mAP) at an Intersection over Union (IoU) threshold of 0.5 by 7.1 percentage points and the Recall by 1.6 percentage points compared with the original YOLOv5s. Experimental results show that the improved YOLOv5s algorithm better meets the precision requirements for detecting electric vehicles and their drivers' helmets in real-world conditions, and can help reduce the incidence of electric vehicle traffic accidents to a certain extent.

Key words: electric vehicle helmet detection, YOLOv5s, attention mechanism, Non-Maximum Suppression (NMS), multi-scale feature detection
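
The DIoU-NMS step named in the abstract can be illustrated with a minimal pure-Python sketch. It follows the general DIoU-NMS rule from the Distance-IoU paper (reference 25): a lower-scoring box is suppressed only if its IoU with the kept box, minus the normalized squared distance between the two box centers, reaches the threshold. The function names (`iou`, `diou`, `diou_nms`), the `(x1, y1, x2, y2)` box layout, and the threshold of 0.5 are assumptions made for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def diou(a: Box, b: Box) -> float:
    """IoU minus the center-distance penalty d^2 / c^2."""
    # d^2: squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both boxes
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou(a, b) - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box whose DIoU with the kept box is too high
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # -> [0, 2]
```

Because the center-distance penalty lowers the suppression score, two boxes that overlap heavily but have clearly separated centers (for example, two riders close together) are more likely to both survive than under plain IoU-based NMS.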


CLC Number:

TP391.41

Zhouhua ZHU, Qi QI. Automatic detection and recognition of electric vehicle helmet based on improved YOLOv5s[J]. Journal of Computer Applications, 2023, 43(4): 1291-1296.


Figures/Tables 13

Fig. 1 Overall design block diagram of detection system

Fig. 2 Improved YOLOv5s network structure

Fig. 3 CBAM structure

Fig. 4 CA module structure

Fig. 5 Comparison of detection effects before and after adding attention module

Fig. 6 Comparison of detection effects before and after improving non-maximum suppression

Fig. 7 Comparison of effects before and after introducing multi-scale feature detection

Fig. 8 Detection system interface

Fig. 9 Some visualization pictures in dataset

Tab. 1 Performance comparison of YOLOv5s before and after improvement

Category	Precision/%		Recall/%		mAP_0.5/%
Category	Before	After	Before	After	Before	After
Average	79.3	88.2	83.7	85.3	84.2	91.3
electric	84.2	91.6	91.5	91.5	93.1	96.9
helmet	86.2	90.5	88.5	92.6	91.1	96.2
no helmet	67.5	82.4	71.1	71.8	68.4	80.8
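
The gains quoted in the abstract are differences in percentage points, not relative percentages. As a small check, the sketch below recomputes them from the averaged row of Tab. 1; the dictionary and function names are chosen here for illustration only.

```python
# Averaged row of Tab. 1, values in %: (before improvement, after improvement)
table1_average = {
    "precision": (79.3, 88.2),
    "recall": (83.7, 85.3),
    "mAP_0.5": (84.2, 91.3),
}


def gain_pp(before: float, after: float) -> float:
    """Absolute gain in percentage points."""
    return round(after - before, 1)


def gain_relative(before: float, after: float) -> float:
    """Relative gain in percent of the original value."""
    return round((after - before) / before * 100, 1)


for metric, (before, after) in table1_average.items():
    print(metric, gain_pp(before, after), gain_relative(before, after))
```

For mAP_0.5 this gives a gain of 7.1 percentage points, which corresponds to a relative improvement of about 8.4% over the original YOLOv5s; for Recall the gain is 1.6 percentage points.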

Tab. 2 Performance comparison of different algorithms

Algorithm	Precision/%	Recall/%	mAP_0.5/%
Algorithm of Ref. [14]	96.0	73.0	85.0
Proposed algorithm	88.2	86.4	93.1

Tab. 3 YOLOv5s ablation experiment results

Model	Precision/%	Recall/%	mAP_0.5/%
Original YOLOv5s	79.3	83.7	84.2
YOLOv5s + attention mechanism	81.6	84.5	86.8
YOLOv5s + DIoU-NMS	80.1	84.1	86.4
YOLOv5s + multi-scale feature fusion	80.8	83.9	87.6

Tab. 4 Performance comparison between the proposed algorithm and YOLOv5 series algorithms

Algorithm	Precision/%	Recall/%	mAP_0.5/%
YOLOv5s	79.3	83.7	84.2
YOLOv5m	83.6	85.6	88.2
YOLOv5l	84.8	86.1	90.1
YOLOv5x	84.1	85.5	89.4
Improved YOLOv5s	88.2	86.4	91.3

References 30

1	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi:10.1109/tpami.2016.2577031
2	HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. doi:10.1109/iccv.2017.322
3	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified real-time object detection[C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. doi:10.1109/cvpr.2016.91
4	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot multibox detector[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9905. Cham: Springer, 2016: 21-37.
5	LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2999-3007. doi:10.1109/iccv.2017.324
6	ZHANG S, CHI C, YAO Y, et al. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 9756-9765. doi:10.1109/cvpr42600.2020.00978
7	YANG Z, LIU S, HU H, et al. RepPoints: point set representation for object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9656-9665. doi:10.1109/iccv.2019.00975
8	TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 9626-9635. doi:10.1109/iccv.2019.00972
9	BHAGAT S, CONTRACTOR D, SHARMA S, et al. Cascade classifier based helmet detection using OpenCV in image processing[C/OL]// Proceedings of the 2016 National Conference on Recent Trends in Computer and Communication Technology. [2021-02-28].
10	SILVA R, AIRES K, SANTOS T, et al. Automatic detection of motorcyclists without helmet[C]// Proceedings of the XXXIX Latin American Computing Conference. Piscataway: IEEE, 2013: 1-7. doi:10.1109/clei.2013.6670613
11	YOGAMEENA B, MENAKA K, PERUMAAL S S. Deep learning-based helmet wear analysis of a motorcycle rider for intelligent surveillance system[J]. IET Intelligent Transport Systems, 2019, 13(7): 1190-1198. doi:10.1049/iet-its.2018.5241
12	VISHNU C, SINGH D, MOHAN C K, et al. Detection of motorcyclists without helmet in videos using convolutional neural network[C]// Proceedings of the 2017 International Joint Conference on Neural Networks. Piscataway: IEEE, 2017: 3036-3041. doi:10.1109/ijcnn.2017.7966233
13	SHINE L, JIJI C V. Automated detection of helmet on motorcyclists from traffic surveillance videos: a comparative analysis using hand-crafted features and CNN[J]. Multimedia Tools and Applications, 2020, 79(19/20): 14179-14199. doi:10.1007/s11042-020-08627-w
14	CHAIRAT A, DAILEY M N, LIMSOONTHRAKUL S, et al. Low cost, high performance automatic motorcycle helmet violation detection[C]// Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2020: 3549-3557. doi:10.1109/wacv45572.2020.9093538
15	SINGH D, VISHNU C, MOHAN C K. Real-time detection of motorcyclist without helmet using cascade of CNNs on edge-device[C]// Proceedings of the IEEE 23rd International Conference on Intelligent Transportation Systems. Piscataway: IEEE, 2020: 1-8. doi:10.1109/itsc45102.2020.9294747
16	DASGUPTA M, BANDYOPADHYAY O, CHATTERJI S. Automated helmet detection for multiple motorcycle riders using CNN[C]// Proceedings of the 2019 IEEE Conference on Information and Communication Technology. Piscataway: IEEE, 2019: 1-4. doi:10.1109/cict48419.2019.9066191
17	ZHAO R, LIU H, LIU P L, et al. Research on helmet detection algorithm based on improved YOLOv5s[J/OL]. Journal of Beijing University of Aeronautics and Astronautics (2021-11-23) [2022-01-23]. doi:10.1109/icccas55266.2022.9825037
18	HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. doi:10.1109/tpami.2015.2389824
19	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. doi:10.1109/cvpr.2017.106
20	LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8759-8768. doi:10.1109/cvpr.2018.00913
21	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11211. Cham: Springer, 2018: 3-19.
22	HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13708-13717. doi:10.1109/cvpr46437.2021.01350
23	ZOU Z Y, GAI S Y, DA F P, et al. Occluded pedestrian detection algorithm based on attention mechanism[J]. Acta Optica Sinica, 2021, 41(15): No.1515001. doi:10.3788/aos202141.1515001
24	WU W K, ZHANG Y, WANG D, et al. SK-Net: deep learning on point cloud via end-to-end discovery of spatial keypoints[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 6422-6429. doi:10.1609/aaai.v34i04.6113
25	ZHANG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. Palo Alto, CA: AAAI Press, 2020: 12993-13000. doi:10.1609/aaai.v34i07.6999
26	LI S S, LI Y J, LI Y, et al. YOLO-FIRI: improved YOLOv5 for infrared image object detection[J]. IEEE Access, 2021, 9: 141861-141875. doi:10.1109/access.2021.3120870
27	IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]// Proceedings of the 32nd International Conference on Machine Learning. New York: JMLR.org, 2015: 448-456.
28	FAN B, LI Z, GAO J. Deep robust watermarking algorithm based on multiscale knowledge learning[J]. Journal of Computer Applications, 2022, 42(10): 3102-3110. doi:10.11772/j.issn.1001-9081.2021050737
29	YAO Q L, HU X, LEI H. Aircraft detection in remote sensing imagery with multi-scale feature fusion convolutional neural networks[J]. Acta Geodaetica et Cartographica Sinica, 2019, 48(10): 1266-1274. doi:10.11947/j.AGCS.2019.20180398
30	ZUO H X, LIAO B, CHEN X K, et al. Application of SC-Net model integrated with transfer learning and data augmentation in skin cancer recognition[J]. Application Research of Computers, 2022, 39(8): 2550-2555, 2560.
