Journal of Computer Applications ›› 2024, Vol. 44 ›› Issue (6): 1949-1958. DOI: 10.11772/j.issn.1001-9081.2023060889
Received: 2023-07-07
Revised: 2023-08-20
Accepted: 2023-08-24
Online: 2023-09-11
Published: 2024-06-10
Contact: Yingjiang LI
About author: DENG Yaping, born in 2000 in Chongqing, M.S. candidate, CCF member. Her research interests include object detection and image processing.
Supported by:
Abstract:
Object detection in autonomous driving scenes is one of the important research directions in computer vision, and ensuring that autonomous vehicles detect objects accurately and in real time is a key research focus. In recent years, deep learning has developed rapidly and been widely applied to autonomous driving, greatly advancing the field. Therefore, the research status of object detection based on YOLO (You Only Look Once) algorithms in autonomous driving was analyzed from the following four aspects. First, the ideas of the single-stage YOLO series of detection algorithms and their improvements were summarized, and the advantages and disadvantages of the YOLO series were analyzed. Second, the applications of YOLO algorithms to object detection in autonomous driving scenes were discussed, and the research status and applications in traffic vehicle, pedestrian, and traffic signal recognition were described and summarized respectively. Furthermore, the evaluation metrics commonly used in object detection, object detection datasets, and autonomous driving scene datasets were summarized. Finally, the existing problems and future development directions of object detection were discussed.
CLC number:
Yaping DENG, Yingjiang LI. Review of YOLO algorithm and its applications to object detection in autonomous driving scenes[J]. Journal of Computer Applications, 2024, 44(6): 1949-1958.
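The review summarizes the evaluation metrics commonly used in object detection, and the tables below report AP, AP50, mAP and FPS. As general background (not taken from the paper itself), the standard definitions can be written as follows, where B_p and B_g denote a predicted and a ground-truth bounding box:

```latex
% Standard object-detection metrics (general background, not from the paper).
\mathrm{IoU}(B_p, B_g) = \frac{|B_p \cap B_g|}{|B_p \cup B_g|}   % box overlap

P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}            % precision, recall at a fixed IoU threshold

\mathrm{AP} = \int_0^1 P(R)\,\mathrm{d}R, \qquad
\mathrm{mAP} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{AP}_c          % per-class AP, mean over C classes

% AP50 is AP at IoU threshold 0.5; COCO-style AP averages AP over
% IoU thresholds 0.50:0.05:0.95.
```

FPS (frames per second) measures inference throughput and depends on hardware and input resolution, so the speeds reported in the tables are only comparable under their respective test conditions.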
Tab. 1 Detection results of different YOLO versions

Detection framework | Benchmark | Input size | FPS | AP/% | AP50/%
---|---|---|---|---|---
YOLOv1 | PASCAL VOC2007 | 448 | | | 63.4
YOLOv2 | PASCAL VOC2007 | 416 | | | 78.6
YOLOv3 | COCO test2017 | 416 | 35 | 31.0 | 55.3
YOLOv4 | COCO test2017 | 608 | 65 | 43.5 | 65.7
YOLOv5s | COCO test2017 | 640 | 170 | 41.2 | 55.4
YOLOX-L | COCO test2017 | 640 | 94 | 49.7 | 68.0
YOLOv6-L | COCO test2017 | 640 | 98 | 52.8 | 70.3
YOLOv7-E6 | COCO test2017 | 1 280 | 16 | 56.8 | 74.4
YOLOv8-L | COCO test2017 | 640 | 91 | 53.9 | 69.8
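Since Tab. 1 compares detectors by FPS as well as accuracy, a minimal timing sketch may clarify how such throughput figures are typically measured. The `detect` callable and the 640×640 dummy input below are placeholders, not the benchmarking protocol of the cited papers.

```python
import time
import numpy as np

def benchmark_fps(detect, image, warmup=10, runs=100):
    """Rough FPS estimate for a detector callable (illustrative only).

    `detect` is any function mapping an image array to detections; real
    benchmarks also fix hardware, batch size and input resolution.
    """
    for _ in range(warmup):              # warm-up iterations are excluded from timing
        detect(image)
    start = time.perf_counter()
    for _ in range(runs):
        detect(image)
    elapsed = time.perf_counter() - start
    return runs / elapsed

if __name__ == "__main__":
    dummy_image = np.zeros((640, 640, 3), dtype=np.uint8)   # placeholder input
    dummy_detect = lambda img: []                           # placeholder detector
    print(f"{benchmark_fps(dummy_detect, dummy_image):.1f} FPS")
```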
Tab. 2 Application of YOLO algorithms in traffic vehicle detection

Application | Ref. | Algorithm | Main improvement | AP/% | mAP/% | FPS
---|---|---|---|---|---|---
2D objects | [ | Edge YOLO | Edge-cloud cooperation and network reconstruction | 47.30 | | 26.60
 | [ | YOLOv3 | Introducing SPP module and Soft-NMS | 95.92 | | 25.00
 | [ | YOLOv5 | Using multi-scale mechanism | 96.34 | | 30.00
 | [ | YOLOv4-tiny | Designing D-CSPNet and SPP | 70.36 | | 117.50
 | [ | YOLOv3 | Using GIoU loss function | 60.90 | |
3D objects | [ | YOLOv2 | Designing E-RPN | 67.72 | | 50.40
 | [ | Complex-YOLO | Introducing SRT | 55.63 | | 15.60
 | [ | YOLOv3 | Introducing 3D space | 44.35 | |
 | [ | YOLOv2 | | 75.30 | | 40.00
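Among the improvements listed in Tab. 2 is replacing the bounding-box regression loss with the GIoU loss [14]. The following is a minimal sketch of the GIoU computation for axis-aligned boxes in `(x1, y1, x2, y2)` format (the box format is an assumption for illustration); the corresponding loss is 1 - GIoU.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # intersection area
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # union area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box C of both boxes
    cx1, cy1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    cx2, cy2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    area_c = (cx2 - cx1) * (cy2 - cy1)
    # GIoU = IoU - |C \ (A ∪ B)| / |C|
    return iou - (area_c - union) / area_c

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))   # partially overlapping boxes
```

Unlike plain IoU, GIoU remains informative for non-overlapping boxes, which is the motivation given in [14] for using it in regression losses.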
Tab. 3 Application of YOLO algorithms in pedestrian recognition

Application | Ref. | Algorithm | Main improvement | AP/% | mAP/% | FPS
---|---|---|---|---|---|---
Small-size | [ | YOLOv5 | Designing Grey-C3 module | 91.80 | |
 | [ | YOLOv3 | Introducing ratio-aware mechanism | 74.20 | |
 | [ | YOLOv4 | Using wavelet transform | 95.63 | |
Occlusion | [ | YOLOv7 | Modifying backbone network | 89.75 | |
 | [ | YOLOv3 | Using SPP and pruning | 93.80 | 94.20 | 50.0
 | [ | YOLOv4 | Designing spatial pyramid convolutional shuffle module | 94.11 | |
Multi-modal recognition | [ | YOLOv3 | Designing multi-modal attention module | | |
 | [ | YOLOv3 | Fusing visible and infrared images | 92.60 | | 19.8
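Several works in Tab. 2 and Tab. 3 insert an SPP module [30] into the YOLO backbone. Below is a sketch of the YOLO-style variant (parallel stride-1 max pooling at several kernel sizes, concatenated with the input), assuming PyTorch and the commonly used 5/9/13 kernels; the channel numbers in the example are placeholders.

```python
import torch
import torch.nn as nn

class SPP(nn.Module):
    """YOLO-style spatial pyramid pooling: concat of multi-scale max pools."""

    def __init__(self, kernel_sizes=(5, 9, 13)):
        super().__init__()
        # stride-1 pooling with matching padding keeps the spatial size unchanged
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in kernel_sizes
        )

    def forward(self, x):
        # output channels = (1 + len(kernel_sizes)) * input channels
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

# Example: a 256-channel feature map keeps its 20x20 resolution,
# channels grow to 1024 before the following 1x1 convolution.
feat = torch.randn(1, 256, 20, 20)
print(SPP()(feat).shape)   # torch.Size([1, 1024, 20, 20])
```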
Tab. 4 Application of YOLO algorithms in traffic signal detection

Application | Ref. | Algorithm | Main improvement | AP/% | mAP/% | FPS
---|---|---|---|---|---|---
Traffic signs | [ | YOLOv5 | Replacing structural parameters | | |
 | [ | YOLOv3 | Fusing VGG network model | 90.00 | |
 | [ | YOLOv7 | SIoU loss function and attention mechanism | 70.84 | |
Occlusion | [ | YOLOv3 | Modifying structural parameters | 88.39 | | 29.30
 | [ | YOLOv3 | Modifying network structure | 88.39 | |
 | [ | YOLOv8 | Designing multi-layer perceptron topology prediction head | | |
Multi-modal recognition | [ | YOLOv2 | | 90.49 | |
 | [ | YOLOv5 | Modifying backbone network | 74.30 | | 111.00
 | [ | YOLOv3 | Simplifying network structure | 46.78 | | 33.00
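Attention mechanisms appear among the improvements in Tab. 4 (for example, combined with the SIoU loss in the YOLOv7-based traffic sign detector). As a generic illustration of this kind of module, not the specific design used in the cited papers, here is a squeeze-and-excitation style channel attention block in PyTorch:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (generic illustration)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)        # squeeze: global context per channel
        self.fc = nn.Sequential(                   # excitation: per-channel weights
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights                         # reweight feature channels

feat = torch.randn(2, 128, 40, 40)
print(ChannelAttention(128)(feat).shape)   # torch.Size([2, 128, 40, 40])
```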
Tab. 5 Common autonomous driving scene detection datasets

Detection target | Dataset | Description | Source
---|---|---|---
Traffic vehicles | KITTI[ | Usable for object detection, tracking, semantic segmentation, etc. | Karlsruhe Institute of Technology (Germany) and Toyota Technological Institute (USA)
 | nuScenes[ | Contains images, LiDAR scans and radar data; a dataset with 3D information | Aptiv
 | Waymo open[ | Large-scale autonomous driving dataset containing data from 3 different road scenes | Waymo
 | ApolloScape[ | For object detection, semantic segmentation, depth estimation, etc.; contains 3D information | Baidu (China)
 | BDD100K[ | Currently the largest autonomous driving scene dataset | University of California, Berkeley
Pedestrians | ETH[ | For pedestrian detection; images captured by a stereo rig mounted on a car; the test set contains 1 804 images from 3 video clips | ETH Zurich
 | INRIA[ | Generally used for static pedestrian detection; contains more than 3 500 images | French National Institute for Research in Computer Science and Automation (INRIA)
Traffic signs | LISA[ | Images and videos of 47 types of US traffic signs collected by different cameras | Karlsruhe Institute of Technology (Germany)
 | TT100K[ | For traffic sign detection; 100 000 images containing 30 000 traffic sign instances | Tsinghua University and Tencent
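KITTI in Tab. 5 is one of the most widely used benchmarks for the vehicle detection works above. For readers preparing it for a YOLO-style detector, the sketch below parses the standard 15-value KITTI object label lines into the class name and 2D box; the file path in the usage comment is hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class KittiObject:
    cls: str            # e.g. 'Car', 'Pedestrian', 'Cyclist', 'DontCare'
    truncated: float    # 0 (not truncated) .. 1 (fully truncated)
    occluded: int       # 0 = visible, 1 = partly, 2 = largely occluded, 3 = unknown
    bbox: List[float]   # 2D box in pixels: left, top, right, bottom

def parse_kitti_labels(path: str) -> List[KittiObject]:
    """Read a KITTI object-label file (15 whitespace-separated values per line)."""
    objects = []
    with open(path) as f:
        for line in f:
            v = line.split()
            if not v:
                continue
            objects.append(KittiObject(
                cls=v[0],
                truncated=float(v[1]),
                occluded=int(v[2]),
                bbox=[float(x) for x in v[4:8]],   # fields 4..7 are the 2D box
            ))
    return objects

# Usage (hypothetical path):
# for obj in parse_kitti_labels("training/label_2/000000.txt"):
#     print(obj.cls, obj.bbox)
```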
1 | ZOU Z, CHEN K, SHI Z, et al. Object detection in 20 years: a survey [J]. Proceedings of the IEEE, 2023, 111(3): 257-276. |
2 | ZHAO Z-Q, ZHENG P, XU S-T, et al. Object detection with deep learning: a review [J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(11): 3212-3232. |
3 | VIOLA P, JONES M. Rapid object detection using a boosted cascade of simple features [C]// Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2001, 1: I-511 - I-518. |
4 | DALAL N, TRIGGS B. Histograms of oriented gradients for human detection [C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2005, 1: 886-893. |
5 | FELZENSZWALB P, McALLESTER D, RAMANAN D. A discriminatively trained, multiscale, deformable part model [C]// Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2008: 1-8. |
6 | 曹家乐,李亚利,孙汉卿,等. 基于深度学习的视觉目标检测技术综述[J]. 中国图象图形学报, 2022, 27(6): 1697-1722. |
CAO J L, LI Y L, SUN H Q, et al. A survey on deep learning based visual object detection [J]. Journal of Image and Graphics, 2022, 27(6): 1697-1722. | |
7 | DIWAN T, ANIRUDH G, TEMBHURNE J V. Object detection using YOLO: challenges, architectural successors, datasets and applications [J]. Multimedia Tools and Applications, 2022, 82(6): 9243-9275. |
8 | GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation [C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 580-587. |
9 | REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. |
10 | HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2980-2988. |
11 | REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 779-788. |
12 | LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector [C]// Proceedings of the 14th European Conference on Computer Vision. Cham: Springer, 2016: 21-37. |
13 | ZHU X, SU W, LU L, et al. Deformable DETR: deformable transformers for end-to-end object detection [EB/OL]. (2020-10-08) [2023-05-23]. . |
14 | REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 658-666. |
15 | NEUBECK A, VAN GOOL L. Efficient non-maximum suppression [C]// Proceedings of the 18th International Conference on Pattern Recognition. Piscataway: IEEE, 2006: 850-855. |
16 | GUO J, HAN K, WANG Y, et al. Hit-detector: hierarchical trinity architecture search for object detection [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11402-11411. |
17 | SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions [C]// Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2015: 1-9. |
18 | MAAS A L, HANNUN A Y, NG A Y. Rectifier nonlinearities improve neural network acoustic models [EB/OL]. [2023-05-30]. . |
19 | REDMON J, FARHADI A. YOLO9000: better, faster, stronger [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 6517-6525. |
20 | SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. (2014-05-01) [2023-05-29]. . |
21 | LIN M, CHEN Q, YAN S. Network in network [EB/OL]. (2013-12-16) [2023-05-29]. . |
22 | REDMON J, FARHADI A. YOLOv3: an incremental improvement [EB/OL]. (2018-04-08) [2023-05-29]. . |
23 | HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 770-778. |
24 | LIN T-Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 936-944. |
25 | BOCHKOVSKIY A, WANG C-Y, LIAO H-Y M. YOLOv4: optimal speed and accuracy of object detection [EB/OL]. (2020-04-23) [2023-05-29]. . |
26 | ZHENG Z, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression [EB/OL]. (2019-11-19) [2023-08-07]. . |
27 | MISRA D. Mish: a self regularized non-monotonic activation function [EB/OL]. (2019-08-23) [2023-05-29]. . |
28 | WANG C-Y, LIAO H-Y M, WU Y-H, et al. CSPNet: a new backbone that can enhance learning capability of CNN [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2020: 1571-1580. |
29 | LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 8759-8768. |
30 | HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. |
31 | NELSON J, SOLAWETZ J. YOLOv5 is here: state-of-the-art object detection at 140 FPS [EB/OL]. (2020-06-10) [2023-05-30]. . |
32 | GE Z, LIU S, WANG F, et al. YOLOX: exceeding YOLO series in 2021 [EB/OL]. (2021-07-18) [2023-05-30]. . |
33 | LAW H, DENG J. CornerNet: detecting objects as paired keypoints [C]// Proceedings of the 15th European Conference on Computer Vision. Cham: Springer, 2018: 765-781. |
34 | DUAN K, BAI S, XIE L, et al. CenterNet: keypoint triplets for object detection [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 6568-6577. |
35 | ZHANG H, CISSE M, DAUPHIN Y N, et al. Mixup: beyond empirical risk minimization [EB/OL]. (2017-10-25) [2023-05-30]. . |
36 | GE Z, LIU S, LI Z, et al. OTA: optimal transport assignment for object detection [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 303-312. |
37 | LI C, LI L, JIANG H, et al. YOLOv6: a single-stage object detection framework for industrial applications [EB/OL]. (2022-09-07) [2023-06-01]. . |
38 | DING X, ZHANG X, MA N, et al. RepVGG: making VGG-style ConvNets great again [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 13728-13737. |
39 | GEVORGYAN Z. SIoU loss: more powerful learning for bounding box regression [EB/OL]. (2022-05-25) [2023-06-01]. . |
40 | WANG C-Y, BOCHKOVSKIY A, LIAO H-Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [EB/OL]. (2022-07-06) [2023-06-01]. . |
41 | SOLAWETZ J, FRANCESCO. What is YOLOv8? The ultimate guide [EB/OL]. (2023-01-11) [2023-06-02]. . |
42 | LIANG S, WU H, ZHEN L, et al. Edge YOLO: real-time intelligent object detection system based on edge-cloud cooperation in autonomous vehicles [J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(12): 25345-25360. |
43 | MAO Q-C, SUN H-M, ZUO L-Q, et al. Finding every car: a traffic surveillance multi-scale vehicle object detection method [J]. Applied Intelligence, 2020, 50: 3125-3136. |
44 | CARRASCO D P, RASHWAN H A, GARCÍA M Á, et al. T-YOLO: tiny vehicle detection based on YOLO and multi-scale convolutional neural networks [J]. IEEE Access, 2023, 11: 22430-22440. |
45 | LI Y, DING H, HU P, et al. Real-time detection algorithm for non-motorized vehicles based on D-YOLO model [J/OL]. Multimedia Tools and Applications (2023-01-25) [2023-08-07]. . |
46 | 叶佳林,苏子毅,马浩炎,等.改进YOLOv3的非机动车检测与识别方法[J].计算机工程与应用,2021,57(1):194-199. |
YE J L, SU Z Y, MA H Y, et al. Improved YOLOv3 non-motor vehicles detection and recognition method [J]. Computer Engineering and Applications, 2021,57(1):194-199. | |
47 | ARNOLD E, AL-JARRAH O Y, DIANATI M, et al. A survey on 3D object detection methods for autonomous driving applications [J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(10): 3782-3795. |
48 | SIMON M, MILZ S, AMENDE K, et al. Complex-YOLO: real-time 3D object detection on point clouds [EB/OL]. (2018-09-24) [2023-06-14]. . |
49 | SIMON M, AMENDE K, KRAUS A, et al. Complexer-YOLO: real-time 3D object detection and tracking on semantic point clouds [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2019: 1190-1199. |
50 | TAKAHASHI M, JI Y, UMEDA K, et al. Expandable YOLO: 3D object detection from RGB-D images [C]// Proceedings of the 2020 21st International Conference on Research and Education in Mechatronics. Piscataway: IEEE, 2020: 1-5. |
51 | ALI W, ABDELKARIM S, ZIDAN M, et al. YOLO3D: end-to-end real-time 3D oriented object bounding box detection from LiDAR point cloud [C]// Proceedings of the 15th European Conference on Computer Vision Workshops. Cham: Springer, 2018: 716-728. |
52 | XU L, YAN W, JI J. The research of a novel WOG-YOLO algorithm for autonomous driving object detection [J]. Scientific Reports, 2023, 13(1): 3699. |
53 | HSU W-Y, LIN W-Y. Ratio-and-scale-aware YOLO for pedestrian detection [J]. IEEE Transactions on Image Processing, 2020, 30: 934-947. |
54 | HSU W-Y, CHEN P-C. Pedestrian detection using stationary wavelet dilated residual super-resolution [J]. IEEE Transactions on Instrumentation and Measurement, 2022, 71: 5001411. |
55 | LI C, WANG Y, LIU X. An improved YOLOv7 lightweight detection algorithm for obscured pedestrians [J]. Sensors, 2023, 23(13): 5912. |
56 | 刘丽, 郑洋, 付冬梅. 改进 YOLOv3 网络结构的遮挡行人检测算法[J]. 模式识别与人工智能, 2020, 33(6): 568-574. |
LIU L, ZHENG Y, FU D M. Improved YOLOv3 network structure occluded pedestrian detection algorithm [J]. Pattern Recognition and Artificial Intelligence, 2020, 33(6): 568-574. | |
57 | LI X, HE M, LIU Y, et al. SPCS: a spatial pyramid convolutional shuffle module for YOLO to detect occluded object [J]. Complex & Intelligent Systems, 2023, 9: 301-315. |
58 | XUE Y, JU Z, LI Y, et al. MAF-YOLO: multi-modal attention fusion based YOLO for pedestrian detection [J]. Infrared Physics & Technology, 2021, 118: 103906. |
59 | 施政,毛力,孙俊.基于YOLO的多模态加权融合行人检测算法[J].计算机工程,2021,47(8):234-242. |
SHI Z, MAO L, SUN J. YOLO-based Multi-modal weighted fusion pedestrian detection algorithm [J]. Computer Engineering, 2021,47(8):234-242. | |
60 | BENJUMEA A, TEETI I, CUZZOLIN F, et al. YOLO-Z: improving small object detection in YOLOv5 for autonomous vehicles [EB/OL]. (2021-12-22) [2023-06-03]. . |
61 | YU J, YE X, TU Q. Traffic sign detection and recognition in multiimages using a fusion model with YOLO and VGG network [J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(9): 16632-16642. |
62 | MAO K, JIN R, YING L, et al. SC-YOLO: provide application-level recognition and perception capabilities for smart city industrial cyber-physical system [J]. IEEE Systems Journal, 2023, 17(4): 5118-5129. |
63 | ZHANG X, YANG W, TANG X, et al. A fast learning method for accurate and robust lane detection using two-stage feature extraction with YOLOv3 [J]. Sensors, 2018, 18(12): 4308. |
64 | 张翔, 唐小林, 黄岩军. 道路结构特征下的车道线智能检测 [J].中国图象图形学报, 2021, 26(1): 123-134. |
ZHANG X, TANG X L, HUANG Y J. Intelligent detection of lane based on road structure characteristics [J]. Journal of Image and Graphics, 2021, 26(1): 123-134. | |
65 | WU D, JIA F, CHANG J, et al. The 1st-place solution for CVPR 2023 OpenLane topology in autonomous driving challenge [EB/OL]. (2023-06-16) [2023-08-01]. . |
66 | JENSEN M B, NASROLLAHI K, MOESLUND T B. Evaluating state-of-the-art object detector on challenging traffic light data [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2017: 882-888. |
67 | 钱伍, 王国中, 李国平. 改进YOLOv5的交通灯实时检测鲁棒算法[J]. 计算机科学与探索, 2022, 16(1): 231-241. |
QIAN W, WANG G Z, LI G P. Improved YOLOv5 traffic light real-time detection robust algorithm[J]. Journal of Frontiers of Computer Science & Technology, 2022, 16(1): 231-241. | |
68 | 孙迎春,潘树国,赵涛,等.基于优化YOLOv3算法的交通灯检测[J].光学学报,2020,40(12): 1215001. |
SUN Y C, PAN S G, ZHAO T, et al. Traffic light detection based on optimized YOLOv3 algorithm [J]. Acta Optica Sinica, 2020, 40(12): 1215001. | |
69 | EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge [J]. International Journal of Computer Vision, 2010, 88: 303-338. |
70 | LIN T-Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [EB/OL]. (2014-05-01) [2023-06-04]. . |
71 | DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database [C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2009: 248-255. |
72 | KUZNETSOVA A, ROM H, ALLDRIN N, et al. The open images dataset v4: unified image classification, object detection, and visual relationship detection at scale [J]. International Journal of Computer Vision, 2020, 128: 1956-1981. |
73 | GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite [C]// Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2012: 3354-3361. |
74 | CAESAR H, BANKITI V, LANG A H, et al. nuScenes: a multimodal dataset for autonomous driving [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11618-11628. |
75 | SUN P, KRETZSCHMAR H, DOTIWALLA X, et al. Scalability in perception for autonomous driving: Waymo open dataset [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 2443-2451. |
76 | HUANG X, CHENG X, GENG Q, et al. The ApolloScape dataset for autonomous driving [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2018: 1067-10676. |
77 | YU F, CHEN H, WANG X, et al. BDD100K: a diverse driving dataset for heterogeneous multitask learning [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 2633-2642. |
78 | ESS A, LEIBE B, VAN GOOL L. Depth and appearance for mobile scene analysis [C]// Proceedings of the 2007 IEEE 11th International Conference on Computer Vision. Piscataway: IEEE, 2007: 1-8. |
79 | DALAL N, TRIGGS B. Histograms of oriented gradients for human detection [C]// Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2005: 886-893. |
80 | JENSEN M B, PHILIPSEN M P, MØGELMOSE A, et al. Vision for looking at traffic lights: issues, survey, and perspectives [J]. IEEE Transactions on Intelligent Transportation Systems, 2016, 17(7): 1800-1815. |
81 | ZHU Z, LIANG D, ZHANG S H, et al. Traffic-sign detection and classification in the wild [C]// Proceedings in the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 2110-2118. |