Journal of Computer Applications (《计算机应用》), sole official website


Logo detection algorithm based on improved YOLOv5

Li Yeheng (李烨恒), Luo Guangsheng (罗光圣), Su Qianmin (苏前敏)

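  1. Shanghai University of Engineering Science
  • Received: 2023-08-17 Revised: 2023-10-23 Online: 2023-12-18 Published: 2023-12-18
  • Corresponding author: Luo Guangsheng (罗光圣)
  • Funding:
    Ministry of Science and Technology, Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, "Automatic Construction of Industrial Domain Knowledge and Reasoning Decision-Making Technology and Applications"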

Logo detection based on improved YOLOv5

  • Received: 2023-08-17 Revised: 2023-10-23 Online: 2023-12-18 Published: 2023-12-18

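Abstract: In recent years, logo detection and recognition have seen continuous innovation and rapid development in computing. Complex image backgrounds and widely varying logo sizes nevertheless remain pain points for logo detection. To address these problems, an improved network model based on YOLOv5 is proposed. The CBAM attention mechanism compresses the features along the channel dimension and the spatial dimension in turn, extracting the key information and important regions of the image. SAC lets the network adaptively adjust the receptive field of its feature maps at different scales, so that it captures object information across scales and detects multi-scale targets more effectively. The NWD metric is embedded in the loss function to reduce the network's sensitivity to the scale of small targets, improving the model's robustness and stability. Experiments were run on the public FlickrLogos-32 and QMULOpenLogo datasets. On the smaller FlickrLogos-32 dataset, mAP improves by 1% over the baseline YOLOv5; on the larger QMULOpenLogo dataset, mAP reaches 62.7%, an improvement of 2.3%. Detection performance is better than that of the original network model and of classic object detection network models.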
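The receptive-field adjustment attributed to SAC can be illustrated with a small sketch. It assumes that SAC refers to switchable atrous convolution, in which the same kernel is applied at a small and a large dilation rate and a per-position switch in [0, 1] blends the two outputs. The abstract does not give the authors' configuration, so everything here is hypothetical: the function names, the 1D setting, and the dilation rates 1 and 3. Two simplifications are made on purpose: the switch values are passed in directly rather than predicted from the input, and both branches share identical weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a dilated kernel.

    Assumes an odd kernel length so the kernel has a centre tap.
    """
    centre = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - centre) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def switchable_conv1d(signal, kernel, switch, small_rate=1, large_rate=3):
    """Blend a small-rate and a large-rate branch with a per-position switch.

    switch[i] = 1 selects the small receptive field, 0 the large one.
    Hypothetical simplification: the switch is given, not learned.
    """
    small = dilated_conv1d(signal, kernel, small_rate)
    large = dilated_conv1d(signal, kernel, large_rate)
    return [s * a + (1.0 - s) * b for s, a, b in zip(switch, small, large)]


if __name__ == "__main__":
    for rate in (1, 3):
        print(rate, effective_kernel_size(3, rate))
    print(switchable_conv1d([1, 2, 3, 4, 5], [1, 1, 1], [1, 1, 0.5, 0, 0]))
```

A 3-tap kernel at dilation 3 spans 7 positions with the same number of weights, which is why switching between rates changes the receptive field without adding parameters.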

Keywords: logo detection, YOLOv5 network model, attention mechanism, small object detection, deep learning

Abstract: Logo detection and recognition have developed rapidly in recent years, yet complex image backgrounds and widely varying logo sizes remain challenging. To address these issues, an improved network model based on YOLOv5 is proposed. The CBAM attention mechanism compresses features along the channel and spatial dimensions to extract the key information and important regions of an image. SAC allows the network to adaptively adjust the receptive field of its feature maps at different scales, improving the detection of multi-scale objects. The NWD metric is embedded into the loss function to reduce the network's sensitivity to the scale of small targets, improving model robustness and stability. In experiments on the publicly available FlickrLogos-32 and QMULOpenLogo datasets, the proposed model improves mAP by 1% over the baseline YOLOv5 on the smaller FlickrLogos-32 dataset; on the larger QMULOpenLogo dataset, mAP reaches 62.7%, an improvement of 2.3%. Detection performance surpasses that of both the original network model and classical object detection models.
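To make the scale-sensitivity argument concrete, the sketch below compares IoU with a normalized Wasserstein distance. It assumes NWD refers to the normalized Gaussian Wasserstein distance used for tiny-object detection, where each (cx, cy, w, h) box is modelled as a 2D Gaussian. The abstract does not say how the authors combine NWD with the YOLOv5 box loss, so `nwd_loss` is only one plausible form, and the normalizing constant `c` is a dataset-dependent assumption chosen here for illustration.

```python
import math


def iou(box_a, box_b):
    """Intersection over union for boxes given as (cx, cy, w, h)."""
    ax1, ax2 = box_a[0] - box_a[2] / 2, box_a[0] + box_a[2] / 2
    ay1, ay2 = box_a[1] - box_a[3] / 2, box_a[1] + box_a[3] / 2
    bx1, bx2 = box_b[0] - box_b[2] / 2, box_b[0] + box_b[2] / 2
    by1, by2 = box_b[1] - box_b[3] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w / 2, h / 2); c is an assumed normalizing constant.
    """
    w2_squared = (
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred, target, c=12.8):
    """One plausible regression loss: 1 - NWD (assumption, not the paper's form)."""
    return 1.0 - nwd(pred, target, c)


if __name__ == "__main__":
    # The same 1-pixel shift applied to a small and a large box.
    small = ((0, 0, 4, 4), (1, 0, 4, 4))
    large = ((0, 0, 40, 40), (1, 0, 40, 40))
    print("IoU:", iou(*small), iou(*large))
    print("NWD:", nwd(*small), nwd(*large))
```

Under these assumptions, a one-pixel shift cuts the IoU of a 4×4 box far more than that of a 40×40 box, while the NWD is the same in both cases because it depends only on the differences in centre and size.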

Keywords: logo detection, YOLOv5 network model, attention mechanism, small object detection, deep learning
