Logo detection based on improved YOLOv5

doi:10.11772/j.issn.1001-9081.2023081113

Journal of Computer Applications (《计算机应用》), sole official website

李烨恒,罗光圣,苏前敏

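Shanghai University of Engineering Science

Received: 2023-08-17 Revised: 2023-10-23 Online: 2023-12-18 Published: 2023-12-18
Corresponding author: 罗光圣
Funding:
Ministry of Science and Technology, Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project: Automatic construction of industrial-domain knowledge and reasoning and decision-making technology and applications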

Abstract

Logo detection and recognition have developed rapidly in recent years, but complex image backgrounds and widely varying logo sizes remain open challenges. To address them, an improved network model based on YOLOv5 is proposed. First, the CBAM attention mechanism compresses features along the channel and spatial dimensions in turn, extracting the key information and important regions of the image. Second, SAC lets the network adaptively adjust the receptive field of its feature maps at different scales, capturing object information across scales and improving the detection of multi-scale targets. Third, the NWD metric is embedded into the loss function to reduce the network's sensitivity to the scale of small targets, improving model robustness and stability. Experiments on the public FlickrLogos-32 and QMULOpenLogo datasets show that, on the smaller FlickrLogos-32 dataset, mAP improves by 1% over the baseline YOLOv5; on the larger QMULOpenLogo dataset, mAP reaches 62.7%, an improvement of 2.3%. The proposed model outperforms both the original network and classical object detection models.

Key words: logo detection, YOLOv5 network model, attention mechanism, small object detection, deep learning
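The abstract does not spell out how the NWD term is computed or weighted. As a minimal sketch, assuming NWD refers to the normalized Gaussian Wasserstein distance commonly used for tiny-object detection, each box (cx, cy, w, h) is treated as a 2D Gaussian and two boxes are compared by the distance between those Gaussians. The function names, the constant `c = 12.8`, and the `1 - NWD` loss form below are illustrative assumptions, not the authors' implementation. The IoU helper is included only to show why a smoother measure helps for small targets.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Similarity in (0, 1] between two (cx, cy, w, h) boxes.

    Each box is modeled as a Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). `c` is a dataset-dependent
    normalizing constant; 12.8 is an assumed placeholder value.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def iou(box_a, box_b):
    """Plain IoU for (cx, cy, w, h) boxes, for comparison only."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd_loss(pred_box, target_box, c=12.8):
    """Hypothetical regression term: 0 when the boxes coincide."""
    return 1.0 - nwd(pred_box, target_box, c)


# A 4x4 logo shifted by 5 pixels no longer overlaps its target at all.
target = (10.0, 10.0, 4.0, 4.0)
shifted = (15.0, 10.0, 4.0, 4.0)
print("IoU:", iou(target, shifted))
print("NWD:", nwd(target, shifted))
```

For the two small boxes above, IoU drops to zero as soon as they stop overlapping, so an IoU-only loss carries no information about how far apart they are, while the NWD value still decreases smoothly with the offset. How such a term is balanced against the existing YOLOv5 box loss is a design choice the abstract leaves open.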

CLC number: TP399

李烨恒, 罗光圣, 苏前敏. Logo detection based on improved YOLOv5[J]. Journal of Computer Applications, DOI: 10.11772/j.issn.1001-9081.2023081113.
