Logo detection algorithm based on improved YOLOv5

doi:10.11772/j.issn.1001-9081.2023081113

Journal of Computer Applications, 2024, Vol. 44, Issue (8): 2580-2587. DOI: 10.11772/j.issn.1001-9081.2023081113

• Multimedia computing and computer simulation •

Yeheng LI, Guangsheng LUO, Qianmin SU

College of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

Received: 2023-08-18 Revised: 2023-10-23 Accepted: 2023-11-02 Online: 2023-12-18 Published: 2024-08-10
Contact: Guangsheng LUO
About the authors:
LI Yeheng, born in 1997 in Wuhan, Hubei, M.S. candidate. His research interests include deep learning and object detection.
LUO Guangsheng, born in 1982 in Huangshi, Hubei, Ph.D., associate professor. His research interests include few-shot learning, federated learning, and object detection. luoguangsheng03@126.com
SU Qianmin, born in 1974 in Shanghai, Ph.D., associate professor. His research interests include deep learning, knowledge graphs, and smart healthcare.
Supported by:
Scientific and Technological Innovation 2030 "New Generation Artificial Intelligence" Major Project of the Ministry of Science and Technology (2020AAA0109300)

Abstract

To address the complex backgrounds and widely varying target sizes found in logo images, an improved detection algorithm based on YOLOv5 is proposed. First, the Convolutional Block Attention Module (CBAM) is incorporated; it applies compression along the channel and spatial dimensions in turn to extract the key information and important regions of the image. Second, Switchable Atrous Convolution (SAC) lets the network adaptively adjust the receptive field size in the feature maps at different scales, which captures object information across scales and improves the detection of multi-scale objects. Finally, the Normalized Wasserstein Distance (NWD) is embedded into the loss function: bounding boxes are modeled as 2D Gaussian distributions and the similarity between the corresponding distributions is computed, which measures the similarity between objects more reliably and improves small-object detection as well as model robustness and stability. Experimental results show that, compared with the original YOLOv5 algorithm, the improved algorithm reaches a mean Average Precision (mAP@0.5) of 90.6% on the small dataset FlickrLogos-32, a gain of 1 percentage point; reaches an mAP@0.5 of 62.7% on the larger dataset QMULOpenLogo, a gain of 2.3 percentage points; and raises mAP@0.5 by 1.2, 1.4, and 1.4 percentage points respectively on three logo categories of LogoDet3K. These results indicate that the improved algorithm detects small logo objects better.

Key words: Logo detection, YOLOv5 network model, Convolutional Block Attention Module (CBAM), small object detection, Normalized Wasserstein Distance (NWD)
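The abstract describes modeling each bounding box as a 2D Gaussian and comparing the Gaussians with a normalized Wasserstein distance. The sketch below illustrates that idea under stated assumptions: boxes are given as `(cx, cy, w, h)`, each box maps to a Gaussian with mean `(cx, cy)` and covariance `diag(w^2/4, h^2/4)`, and `c` is a dataset-dependent normalizing constant whose default here is only a placeholder. The function names and the `1 - NWD` loss form are illustrative and are not taken from the paper's code; an `iou` helper is included only for contrast.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    For Gaussians with diagonal covariance diag(w^2/4, h^2/4), the squared
    2-Wasserstein distance reduces to a squared Euclidean distance between
    the vectors (cx, cy, w/2, h/2). The constant c is an assumption.
    """
    w2_squared = (
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(box_pred, box_true, c=12.8):
    """One simple way to turn the similarity into a loss term (assumed form)."""
    return 1.0 - nwd(box_pred, box_true, c)


if __name__ == "__main__":
    tiny_a, tiny_b = (10, 10, 6, 6), (13, 10, 6, 6)            # 3-pixel shift
    large_a, large_b = (100, 100, 60, 60), (103, 100, 60, 60)  # same shift
    for name, a, b in (("tiny", tiny_a, tiny_b), ("large", large_a, large_b)):
        print(name, "IoU:", round(iou(a, b), 3), "NWD:", round(nwd(a, b), 3))
```

In this sketch the same 3-pixel shift changes IoU far more for the tiny boxes than for the large ones, while the NWD value depends only on the offset and size difference. Unlike IoU, it also stays positive when the boxes do not overlap at all, which is one reason a distribution-based measure may suit small logos.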

CLC Number: TP399

Yeheng LI, Guangsheng LUO, Qianmin SU. Logo detection algorithm based on improved YOLOv5[J]. Journal of Computer Applications, 2024, 44(8): 2580-2587.

Figures and Tables (19)

Fig. 1 Improved structure of YOLOv5 network

Fig. 2 CBAM attention mechanism

Fig. 3 Workflow of channel attention mechanism

Fig. 4 Workflow of spatial attention mechanism

Fig. 5 Combination of CBAM with YOLOv5 network

Fig. 6 Flow of SAC

Fig. 7 C3_SAC structure
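The channel and spatial attention steps named in Figs. 2 to 4 can be illustrated with a small dependency-free sketch. It follows the general CBAM recipe (channel attention from pooled descriptors passed through a shared two-layer MLP, then spatial attention from channel-wise mean and max maps passed through a convolution). All weights, shapes, and function names are hypothetical and chosen for illustration; this is not the authors' YOLOv5 integration.

```python
import math


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def channel_attention(x, w1, w2):
    """x: C channels, each an H x W nested list. Returns C weights in (0, 1).

    w1 has shape [hidden][C] and w2 has shape [C][hidden] (assumed layout).
    """
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]  # global average pooling
    mx = [max(map(max, ch)) for ch in x]                            # global max pooling

    def mlp(v):  # shared two-layer MLP with ReLU in between
        hidden = [max(0.0, sum(w * vi for w, vi in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(x, kernel):
    """Returns an H x W map in (0, 1). kernel has shape [2][k][k], k odd."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    mean_map = [[sum(x[c][i][j] for c in range(channels)) / channels
                 for j in range(width)] for i in range(height)]
    max_map = [[max(x[c][i][j] for c in range(channels))
                for j in range(width)] for i in range(height)]
    k = len(kernel[0])
    pad = k // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for m, fmap in enumerate((mean_map, max_map)):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < height and 0 <= jj < width:  # zero padding
                            acc += kernel[m][di][dj] * fmap[ii][jj]
            out[i][j] = sigmoid(acc)
    return out


def cbam(x, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to a C x H x W feature map."""
    ca = channel_attention(x, w1, w2)
    refined = [[[v * ca[c] for v in row] for row in ch] for c, ch in enumerate(x)]
    sa = spatial_attention(refined, kernel)
    return [[[ch[i][j] * sa[i][j] for j in range(len(sa[0]))]
             for i in range(len(sa))] for ch in refined]


if __name__ == "__main__":
    feature = [[[1.0, 2.0], [3.0, 4.0]], [[-1.0, 0.0], [2.0, 5.0]]]  # C=2, H=W=2
    w1 = [[0.5, -0.5]]                                               # hidden=1
    w2 = [[1.0], [-1.0]]
    kernel = [[[0.1] * 3 for _ in range(3)] for _ in range(2)]       # 3x3 for brevity
    print(cbam(feature, w1, w2, kernel))
```

Because both attention maps lie in (0, 1), the output has the same shape as the input and each value is a scaled-down copy of the corresponding input value. A real implementation would use learned weights and a tensor library rather than nested lists.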

Tab. 1 Comparison of time and space complexities among different algorithms

Algorithm	Time complexity	Space complexity
YOLOv3	$O(HWCK^{2}SUV)$	$O(HWCGS^{2}U^{2})$
YOLOv4	$O(5HWCK^{2}SU^{2}V)$	$O(5HWCGS^{2}U^{2})$
YOLOv5	$O(8HWCK^{2}SU^{2}V)$	$O(8HWCGS^{2}U^{2})$
Proposed algorithm	$O(8HWC^{2}K^{2}SU^{2}V)$	$O(8HWCK^{2}NGS^{2}U^{2})$

Tab. 2 Information statistics of experimental datasets

Dataset	Logo classes	Images	Objects
FlickrLogos-32	32	2 240	3 450
QMULOpenLogo	352	27 083	51 207
LogoDet3K-Food	932	53 350	64 276
LogoDet3K-Clothes	604	31 266	37 601
LogoDet3K-Necessities	432	24 822	30 643

Tab. 3 Comparison of experimental results of different models on FlickrLogos-32

Model	Backbone	mAP@0.5/%
Faster R-CNN	ResNet-50	83.50
SSD	VGG-16	80.20
Logo-YOLO	Darknet53	76.10
GFL	ResNet-50	86.00
Trinity-YOLO	Darknet53	85.38
YOLOv5	Darknet53	89.60
Proposed model	Darknet53	90.60

Tab. 4 Ablation experiment results on FlickrLogos-32

CBAM	SE	SAC	NWD	mAP@0.5/%
				89.6
√				90.2
		√		89.9
√			√	90.3
√		√	√	90.6
	√			89.8
	√		√	90.1
	√	√	√	90.5

Tab. 5 Comparison of experimental results of different models on QMULOpenLogo

Model	Backbone	mAP@0.5/%
Faster R-CNN	ResNet-50	51.7
SSD	VGG-16	42.0
GFL	ResNet-50	49.1
ATSS	ResNet-50	48.8
YOLOv5	Darknet53	60.4
Proposed model	Darknet53	62.7

Tab. 6 Ablation experiment results on QMULOpenLogo

CBAM	SE	SAC	NWD	mAP@0.5/%
				60.4
√				60.8
√		√		61.9
√			√	61.6
√		√	√	62.7
	√			59.8
	√	√		60.7
	√		√	61.4
	√	√	√	62.5

Fig. 8 Precision-Recall curve comparison on QMULOpenLogo

Tab. 7 Comparison results on three logo categories of LogoDet3K

Dataset	Algorithm	mAP@0.5/%
LogoDet3K-Food	YOLOv5	87.3
LogoDet3K-Food	Proposed algorithm	88.5
LogoDet3K-Clothes	YOLOv5	91.2
LogoDet3K-Clothes	Proposed algorithm	92.6
LogoDet3K-Necessities	YOLOv5	89.8
LogoDet3K-Necessities	Proposed algorithm	91.2
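The percentage-point gains quoted in the abstract can be cross-checked against the tabulated mAP@0.5 values. The short sketch below copies the YOLOv5 and proposed-model figures from Tab. 3, Tab. 5, and Tab. 7 and recomputes the differences. The dictionary layout and the helper name `gain_points` are illustrative choices only.

```python
# mAP@0.5 values in percent, as (YOLOv5 baseline, proposed) pairs,
# copied from Tab. 3, Tab. 5, and Tab. 7.
results = {
    "FlickrLogos-32": (89.6, 90.6),
    "QMULOpenLogo": (60.4, 62.7),
    "LogoDet3K-Food": (87.3, 88.5),
    "LogoDet3K-Clothes": (91.2, 92.6),
    "LogoDet3K-Necessities": (89.8, 91.2),
}


def gain_points(baseline, improved):
    """Absolute gain in percentage points, rounded to one decimal place."""
    return round(improved - baseline, 1)


gains = {name: gain_points(*pair) for name, pair in results.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
```

The recomputed differences match the 1, 2.3, 1.2, 1.4, and 1.4 percentage-point gains stated in the abstract.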

Fig. 9 Comparison of detection results on LogoDet3K-Clothes

Fig. 10 Comparison of complex background recognition results

Fig. 11 Comparison of object occlusion detection results

Fig. 12 Comparison of multi-object detection results

