Improved traffic sign recognition algorithm based on YOLO v3 algorithm
JIANG Jinhong1,2, BAO Shengli1,2, SHI Wenxu1,2, WEI Zhenkun1,2
1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610041, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: To address the large parameter counts, poor real-time performance and low accuracy of traffic sign recognition algorithms based on deep learning, an improved traffic sign recognition algorithm based on YOLO v3 was proposed. First, depthwise separable convolution was introduced into the feature extraction layer of YOLO v3. Each convolution was decomposed into a depthwise convolution and a pointwise convolution, separating intra-channel convolution from inter-channel convolution and thereby greatly reducing the number of parameters and the computational cost of the algorithm while maintaining high accuracy. Second, the Mean Square Error (MSE) loss was replaced by the Generalized Intersection over Union (GIoU) loss, which uses the evaluation criterion itself as the loss. This resolved the problems of the MSE loss, namely its inconsistency with the optimization target and its sensitivity to scale. In addition, the Focal loss was added to the loss function to address the severe imbalance between positive and negative samples. By reducing the weight of easy background examples, the new algorithm focused more on detecting foreground classes. Results on the traffic sign recognition task show that, on the TT100K (Tsinghua-Tencent 100K) dataset, the mean Average Precision (mAP) of the algorithm reaches 89%, which is 6.6 percentage points higher than that of YOLO v3; the number of parameters is only about 1/5 of that of the original YOLO v3, and the Frames Per Second (FPS) is 60% higher than that of YOLO v3. The proposed algorithm improves detection speed and accuracy while reducing the number of model parameters and the amount of computation.
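The three components named in the abstract can be illustrated with a minimal plain-Python sketch. It is not the authors' implementation: the function names, the corner-format boxes `(x1, y1, x2, y2)`, and the layer sizes in the usage note are hypothetical, and the defaults `alpha=0.25`, `gamma=2.0` are the commonly used Focal loss settings, assumed here because the abstract does not state the values used.

```python
import math

def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k):
    """Depthwise k x k (one filter per input channel) plus pointwise 1 x 1."""
    return k * k * c_in + c_in * c_out

def giou_loss(a, b):
    """1 - GIoU for two boxes (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    giou = inter / union - (hull - union) / hull
    return 1.0 - giou

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted probability p and label y in {0, 1}.

    alpha and gamma are assumed defaults, not values taken from the abstract.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-7), 1.0)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For a hypothetical 3 x 3 layer with 256 input and 512 output channels, `conv_params(256, 512, 3)` gives 1179648 weights and `separable_conv_params(256, 512, 3)` gives 133376. This is a per-layer figure only; the roughly 1/5 reduction reported in the abstract refers to the whole model. Unlike an MSE on coordinates, `giou_loss` is unchanged when both boxes are scaled by the same factor, and it still yields a useful signal for non-overlapping boxes. `focal_loss` assigns a far smaller loss to a confidently rejected background example than to a misclassified one.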