Journal of Computer Applications (计算机应用), 2020, Vol. 40, Issue (8): 2472-2478. DOI: 10.11772/j.issn.1001-9081.2020010062

• Frontier and Comprehensive Applications •

基于YOLO v3算法改进的交通标志识别算法

江金洪1,2, 鲍胜利1,2, 史文旭1,2, 韦振坤1,2   

  1. 中国科学院 成都计算机应用研究所, 成都 610041;
    2. 中国科学院大学, 北京 100049
  • 收稿日期:2020-02-04 修回日期:2020-03-31 出版日期:2020-08-10 发布日期:2020-03-31
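  • Corresponding author: JIANG Jinhong (born 1994), male, from Mianyang, Sichuan, M.S. candidate; main research interests: deep learning, data mining. 1127515524@qq.com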
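  • About the authors: BAO Shengli (born 1973), male, from Huangshan, Anhui, professor-level senior engineer, Ph.D. candidate; main research interests: intelligent information processing, deep learning. SHI Wenxu (born 1995), male, from Jiaozuo, Henan, M.S. candidate; main research interests: deep learning, intelligent algorithms. WEI Zhenkun (born 1995), male, from Fuyang, Anhui, Ph.D. candidate; main research interests: reinforcement learning, machine learning.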
  • 基金资助:
    四川省科技厅重点研发项目(2018SZ0040);四川省新一代人工智能重大专项(2018GZDZX0036)。

Improved traffic sign recognition algorithm based on YOLO v3 algorithm

JIANG Jinhong1,2, BAO Shengli1,2, SHI Wenxu1,2, WEI Zhenkun1,2   

  1. Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu Sichuan 610041, China;
    2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2020-02-04  Revised: 2020-03-31  Online: 2020-08-10  Published: 2020-03-31
  • Supported by:
    This work is partially supported by the Key Research and Development Program of the Science and Technology Department of Sichuan Province (2018SZ0040) and the Major Project of New Generation Artificial Intelligence in Sichuan Province (2018GZDZX0036).

摘要: 针对目前交通标志识别任务在使用深度学习算法时存在模型参数量大、实时性较差和准确率较低的问题,提出了基于YOLO v3改进的交通标志识别算法。该算法首先将深度可分离卷积引入YOLO v3算法的特征提取层,将卷积过程分解为深度卷积、逐点卷积两部分,实现通道内卷积与通道间卷积之间的分离,从而保证了在较高识别准确率的基础上极大地减少了算法模型参数数量以及计算量。其次,在损失函数设计上使用广义交并比(GIoU)损失替换均方误差(MSE)损失,将评测标准量化为损失,解决了MSE损失存在的优化不一致和尺度敏感的问题,同时将Focal损失加入到损失函数以解决正负样本严重不均衡的问题,通过降低大量简单背景类的权重使得算法更专注于检测前景类。将该算法应用于交通标志任务中的结果表明,在TT100K数据集上,该算法的平均精度均值(mAP)指标达到了89%,相较于YOLO v3算法提升了6.6个百分点,且其参数量仅为原始YOLO v3算法的1/5左右,每秒帧数(FPS)亦比YOLO v3算法提升了60%。该算法在极大地减少模型参数量和计算量的同时,提高了检测速度和检测精度。

关键词: 交通标志识别, YOLO v3算法, 广义交并比, 深度可分离卷积, 损失函数, Focal损失

Abstract: To address the large parameter counts, poor real-time performance, and low accuracy of deep-learning-based traffic sign recognition algorithms, an improved traffic sign recognition algorithm based on YOLO v3 was proposed. First, depthwise separable convolution was introduced into the feature extraction layers of YOLO v3, decomposing each convolution into a depthwise convolution and a pointwise convolution so that intra-channel and inter-channel convolution are separated; this greatly reduced the number of parameters and the computational cost while maintaining high recognition accuracy. Second, the Mean Square Error (MSE) loss was replaced by the Generalized Intersection over Union (GIoU) loss, which turns the evaluation metric itself into a loss and thereby resolves the optimization inconsistency and scale sensitivity of the MSE loss. In addition, the Focal loss was added to the loss function to address the severe imbalance between positive and negative samples: by down-weighting the large number of easy background samples, it makes the algorithm focus on detecting foreground classes. Experimental results on the traffic sign recognition task show that, on the TT100K (Tsinghua-Tencent 100K) dataset, the mean Average Precision (mAP) of the proposed algorithm reaches 89%, which is 6.6 percentage points higher than that of YOLO v3; its number of parameters is only about 1/5 of that of the original YOLO v3, and its Frames Per Second (FPS) is 60% higher than that of YOLO v3. The proposed algorithm thus improves detection speed and accuracy while greatly reducing the number of model parameters and the computational cost.

Key words: traffic sign recognition, YOLO v3 algorithm, Generalized Intersection over Union (GIoU), depthwise separable convolution, loss function, Focal loss
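
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the (x1, y1, x2, y2) box format, and the channel sizes in the usage note are assumptions made for illustration, and the Focal loss defaults alpha = 0.25 and gamma = 2.0 are commonly used values that the abstract does not state.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weight count of a depthwise separable convolution (bias ignored):
    one k x k depthwise filter per input channel, then a 1 x 1 pointwise
    convolution that mixes channels."""
    return k * k * c_in + c_in * c_out


def giou(box_a, box_b):
    """Generalized IoU of two boxes in (x1, y1, x2, y2) format.
    Assumes x1 < x2 and y1 < y2, so every area is positive."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (hull - union) / hull


def giou_loss(box_a, box_b):
    """GIoU loss: 0 for identical boxes, approaching 2 for distant boxes."""
    return 1.0 - giou(box_a, box_b)


def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary Focal loss for one predicted probability p in (0, 1) and a
    target in {0, 1}. alpha and gamma are assumed defaults, not values
    taken from the abstract."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified (easy) samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For a hypothetical 3 x 3 layer with 256 input and 512 output channels, conv_params gives 1179648 weights and separable_conv_params gives 133376, roughly a ninefold reduction for that single layer. Unlike plain IoU, giou stays informative for non-overlapping boxes: two unit squares separated by a gap of one unit score -1/3 instead of 0, so the loss still reflects how far apart they are.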
