Journal of Computer Applications, 2023, Vol. 43, Issue (2): 608-614. DOI: 10.11772/j.issn.1001-9081.2022010100

• Multimedia computing and computer simulation •

Lightweight traffic sign recognition model based on coordinate attention

Wenju LI, Gan ZHANG, Liu CUI, Wanghui CHU

  1. School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China
  • Received: 2022-01-25 Revised: 2022-04-25 Accepted: 2022-04-26 Online: 2022-05-31 Published: 2023-02-10
  • Contact: Liu CUI
  • About author: LI Wenju, born in 1964, from Yingkou, Liaoning, Ph. D., professor, CCF member. His research interests include computer vision, pattern recognition, and intelligent detection.
    ZHANG Gan, born in 1997, from Xuzhou, Jiangsu, M. S. candidate. His research interests include object detection and traffic sign recognition.
    CHU Wanghui, born in 1998, from Chizhou, Anhui, M. S. candidate. Her research interests include graph neural networks and 3D object detection.
  • Supported by:
    National Natural Science Foundation of China (61903256)

Abstract:

To address the imbalance between detection speed and recognition accuracy in traffic sign recognition models, as well as the difficulty of detecting occluded targets and small targets, the YOLOv5 (You Only Look Once version 5) model was improved and a lightweight traffic sign recognition model based on Coordinate Attention (CA) was proposed. Firstly, the CA mechanism was integrated into the backbone network to capture the relationships between location information and channels effectively, so that regions of interest were obtained more accurately without excessive computational overhead. Then, cross-layer connections were added to the feature fusion network to fuse more feature information without increasing the cost, which improved the feature extraction ability of the network and the detection of occluded targets. Finally, an improved CIoU (Complete Intersection over Union) function was introduced to calculate the localization loss, which alleviated the unbalanced distribution of sample sizes during detection and further improved the recognition accuracy for small targets. On the TT100K (Tsinghua-Tencent 100K) dataset, the proposed model achieved a recognition accuracy of 91.5% and a recall of 86.64%, which were 20.96% and 11.62% higher, respectively, than those of the traditional YOLOv5n model, at a frame processing rate of 140.84 FPS (Frames Per Second). The experimental results verify reasonably thoroughly the accuracy and real-time performance of the proposed model for traffic sign detection and recognition in real scenes.

Key words: YOLOv5 (You Only Look Once version 5), traffic sign recognition, Coordinate Attention (CA), feature fusion, loss function
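
For readers unfamiliar with the localization loss named in the abstract, the following is a minimal sketch of the standard CIoU loss for a single pair of boxes. It is not the improved variant proposed in the paper, whose details the abstract does not give. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration only.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Standard CIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Illustrative baseline only; the paper's improved CIoU is not shown.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 \
        + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)
```

Under this definition, identical boxes give a loss close to 0, while non-overlapping boxes give a loss above 1 because the center-distance penalty is added on top of the zero IoU. That penalty is what keeps the gradient informative for small targets whose predicted boxes do not yet overlap the ground truth.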

CLC number: