Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (8): 2580-2587. DOI: 10.11772/j.issn.1001-9081.2023081113
• Multimedia computing and computer simulation •
Yeheng LI, Guangsheng LUO, Qianmin SU
Received: 2023-08-18
Revised: 2023-10-23
Accepted: 2023-11-02
Online: 2023-12-18
Published: 2024-08-10
Contact: Guangsheng LUO
About author: LI Yeheng, born in 1997, a native of Wuhan, Hubei, M. S. candidate. His research interests include deep learning and object detection.
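Abstract:
To address the complex backgrounds and widely varying target sizes in Logo images, an improved detection algorithm based on YOLOv5 was proposed. Firstly, the Convolutional Block Attention Module (CBAM) was incorporated to compress features along the channel and spatial dimensions separately, extracting the key information and important regions of the image. Then, Switchable Atrous Convolution (SAC) was used so that the network adaptively adjusts the receptive field size of the feature map at different scales, capturing object information at those scales and improving the detection of multi-scale targets. Finally, the Normalized Wasserstein Distance (NWD) was embedded into the loss function: bounding boxes were modeled as 2D Gaussian distributions, and the similarity between the corresponding Gaussian distributions was computed to better measure the similarity between targets, improving small-target detection performance as well as model robustness and stability. Experimental results show that on the small dataset FlickrLogos-32, the improved algorithm achieves a mean Average Precision (mAP@0.5) of 90.6%, 1 percentage point higher than the original YOLOv5; on the large dataset QMULOpenLogo, it achieves an mAP@0.5 of 62.7%, 2.3 percentage points higher than the original YOLOv5; and on LogoDet3K, a Logo detection dataset organized by category, it raises mAP@0.5 over the original algorithm by 1.2, 1.4 and 1.4 percentage points on three trademark categories, indicating a better ability to detect small targets in Logo images.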
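The NWD term described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation in which a box `(cx, cy, w, h)` is modeled as a 2D Gaussian with mean `(cx, cy)` and covariance `diag(w²/4, h²/4)`, so that the squared 2-Wasserstein distance between two boxes has a closed form. The function names `nwd` and `nwd_loss`, and the default value of the normalizing constant `c`, are illustrative assumptions; the abstract does not state the constant used or how the term is weighted against the other YOLOv5 loss components.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    `c` is a normalizing constant; 12.8 is only a placeholder (assumption).
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Closed-form squared 2-Wasserstein distance between the two Gaussians
    # N((cx, cy), diag(w^2/4, h^2/4)) that stand in for the boxes.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    # Map the distance into (0, 1]; identical boxes give exactly 1.
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss term that decreases as the predicted box approaches the target."""
    return 1.0 - nwd(pred_box, target_box, c)


if __name__ == "__main__":
    small_a = (10.0, 10.0, 4.0, 4.0)
    small_b = (20.0, 10.0, 4.0, 4.0)  # does not overlap small_a, so IoU is 0
    print(nwd(small_a, small_a), nwd(small_a, small_b))
```

Unlike IoU, which is zero for any pair of non-overlapping boxes, this similarity still varies smoothly with the distance between them, which is the property that makes it attractive for small targets.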
Yeheng LI, Guangsheng LUO, Qianmin SU. Logo detection algorithm based on improved YOLOv5[J]. Journal of Computer Applications, 2024, 44(8): 2580-2587.
Algorithm | Time complexity | Space complexity |
---|---|---|
YOLOv3 | ||
YOLOv4 | ||
YOLOv5 | ||
Proposed algorithm |
Tab. 1 Comparison of time and space complexities among different algorithms
Dataset | Number of Logo classes | Number of images | Number of objects |
---|---|---|---|
FlickrLogos-32 | 32 | 2 240 | 3 450 |
QMULOpenLogo | 352 | 27 083 | 51 207 |
LogoDet3K-Food | 932 | 53 350 | 64 276 |
LogoDet3K-Clothes | 604 | 31 266 | 37 601 |
LogoDet3K-Necessities | 432 | 24 822 | 30 643 |
Tab. 2 Information statistics of experimental datasets
Model | Backbone | mAP@0.5/% |
---|---|---|
Faster R-CNN | ResNet-50 | 83.50 |
SSD | VGG-16 | 80.20 |
Logo-YOLO | Darknet53 | 76.10 |
GFL | ResNet-50 | 86.00 |
Trinity-YOLO | Darknet53 | 85.38 |
YOLOv5 | Darknet53 | 89.60 |
Proposed model | Darknet53 | 90.60 |
Tab. 3 Comparison of experimental results of different models on FlickrLogos-32
CBAM | SE | SAC | NWD | mAP@0.5/% |
---|---|---|---|---|
89.6 | ||||
√ | 90.2 | |||
√ | 89.9 | |||
√ | √ | 90.3 | ||
√ | √ | √ | 90.6 | |
√ | 89.8 | |||
√ | √ | 90.1 | ||
√ | √ | √ | 90.5 |
Tab. 4 Ablation experiment results on FlickrLogos-32
Model | Backbone | mAP@0.5/% |
---|---|---|
Faster R-CNN | ResNet-50 | 51.7 |
SSD | VGG-16 | 42.0 |
GFL | ResNet-50 | 49.1 |
ATSS | ResNet-50 | 48.8 |
YOLOv5 | Darknet53 | 60.4 |
Proposed model | Darknet53 | 62.7 |
Tab. 5 Comparison of experimental results of different models on QMULOpenLogo
CBAM | SE | SAC | NWD | mAP@0.5/% |
---|---|---|---|---|
60.4 | ||||
√ | 60.8 | |||
√ | √ | 61.9 | ||
√ | √ | 61.6 | ||
√ | √ | √ | 62.7 | |
√ | 59.8 | |||
√ | √ | 60.7 | ||
√ | √ | 61.4 | ||
√ | √ | √ | 62.5 |
Tab. 6 Ablation experiment results on QMULOpenLogo
Dataset | Algorithm | mAP@0.5/% |
---|---|---|
LogoDet3K-Food | YOLOv5 | 87.3 |
LogoDet3K-Food | Proposed algorithm | 88.5 |
LogoDet3K-Clothes | YOLOv5 | 91.2 |
LogoDet3K-Clothes | Proposed algorithm | 92.6 |
LogoDet3K-Necessities | YOLOv5 | 89.8 |
LogoDet3K-Necessities | Proposed algorithm | 91.2 |
Tab. 7 Comparison results of three types on LogoDet3K
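As a quick arithmetic check, the percentage-point gains quoted in the abstract can be recomputed from the YOLOv5 and improved-algorithm rows of Tabs. 3, 5 and 7. The snippet below is only a convenience sketch; the dictionary layout and the helper name `gain_pp` are illustrative and not part of the paper.

```python
# mAP@0.5 (%) pairs taken from Tabs. 3, 5 and 7:
# (original YOLOv5, improved algorithm)
results = {
    "FlickrLogos-32": (89.6, 90.6),
    "QMULOpenLogo": (60.4, 62.7),
    "LogoDet3K-Food": (87.3, 88.5),
    "LogoDet3K-Clothes": (91.2, 92.6),
    "LogoDet3K-Necessities": (89.8, 91.2),
}


def gain_pp(baseline, improved):
    """Gain in percentage points, rounded to the one decimal the tables report."""
    return round(improved - baseline, 1)


gains = {name: gain_pp(*pair) for name, pair in results.items()}

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: +{gain} percentage points")
```

The recomputed gains (1.0, 2.3, 1.2, 1.4 and 1.4 percentage points) agree with the figures stated in the abstract.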