Journal of Computer Applications official website ›› 2021, Vol. 41 ›› Issue (11): 3242-3250. DOI: 10.11772/j.issn.1001-9081.2021020327
Received: 2021-03-05
Revised: 2021-04-15
Accepted: 2021-04-20
Online: 2021-04-29
Published: 2021-11-10
Contact: Yuan ZHU
About author: GAO Jie, born in 1996, M. S. candidate. Her research interests include multi-sensor fusion, multi-object tracking, and object detection.
Abstract: In autonomous driving perception systems, multi-sensor fusion is commonly used to improve the reliability of perception results. For the object detection task in such a fusion perception system, an object detection method based on radar and camera fusion, called PRRPN, was proposed; it uses radar measurements and the detection results of the previous frame to improve region proposal generation in the image detection network and thereby improve detection performance. First, the objects detected in the previous frame were associated with the radar points in the current frame to pre-classify the radar points. Then, the pre-classified radar points were projected into the image, and the corresponding prior proposal regions and radar proposal regions were obtained from the radar range and Radar Cross Section (RCS) measurements. Finally, bounding box regression and classification were performed on these proposals. In addition, PRRPN was fused with the Region Proposal Network (RPN) for object detection. The three detection methods were evaluated on the newly released nuScenes dataset. Experimental results show that, compared with RPN, PRRPN not only detects objects faster but also improves the average precision on small objects by 2.09 percentage points; fusing PRRPN with RPN improves the average precision by 2.54 and 0.34 percentage points over PRRPN alone and RPN alone, respectively.
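The abstract describes projecting radar points into the image and sizing proposals from range and RCS. A minimal sketch of that idea follows; the function name, the linear RCS scaling, and the assumption that radar points are already in the camera frame are all illustrative choices, not the authors' implementation:

```python
import numpy as np

def radar_proposals(radar_points, calib, img_shape, base_size=50.0):
    """Project radar points into the image and turn each into a proposal box.

    radar_points: (N, 4) array of [x, y, z, rcs], assumed already
    transformed into the camera frame (z pointing forward).
    calib: (3, 4) camera projection matrix.
    Box side length shrinks with range and grows with RCS, mimicking the
    idea that RCS hints at object size.
    """
    h, w = img_shape
    n = len(radar_points)
    xyz1 = np.hstack([radar_points[:, :3], np.ones((n, 1))])
    uvw = xyz1 @ calib.T                                  # homogeneous image coords
    uv = uvw[:, :2] / uvw[:, 2:3]                         # perspective divide
    rng_m = np.linalg.norm(radar_points[:, :3], axis=1)   # range in metres
    rcs = radar_points[:, 3]
    # heuristic sizing: inversely proportional to range, scaled by RCS
    side = (base_size * 30.0 / np.maximum(rng_m, 1.0)
            * (1.0 + 0.1 * np.clip(rcs, 0.0, 20.0)))
    x1 = np.clip(uv[:, 0] - side / 2, 0, w - 1)
    y1 = np.clip(uv[:, 1] - side / 2, 0, h - 1)
    x2 = np.clip(uv[:, 0] + side / 2, 0, w - 1)
    y2 = np.clip(uv[:, 1] + side / 2, 0, h - 1)
    return np.stack([x1, y1, x2, y2], axis=1)
```

In the paper's method, the pre-classification step additionally associates each radar point with a previous-frame detection before generating proposals; that association is omitted in this sketch.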
Jie GAO, Yuan ZHU, Ke LU. Object detection method based on radar and camera fusion[J]. Journal of Computer Applications, 2021, 41(11): 3242-3250.
Tab. 1  Structural parameters of ResNet50

Layer name | Structure | Stride
---|---|---
conv1 | 7×7, 64 | 2
conv2_x | 3×3 max pool | 2
 | [1×1, 64; 3×3, 64; 1×1, 256] × 3 | 1
conv3_x | [1×1, 128; 3×3, 128; 1×1, 512] × 4 | 2, then 1
conv4_x | [1×1, 256; 3×3, 256; 1×1, 1024] × 6 | 2, then 1
conv5_x | [1×1, 512; 3×3, 512; 1×1, 2048] × 3 | 2, then 1
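As a quick sanity check on Tab. 1, the spatial resolution of the feature map can be traced through the stem and stages; the padding values below assume the standard ResNet50, and `conv_out` is a helper defined here, not part of the paper:

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1

s = 224                      # standard ResNet input resolution
s = conv_out(s, 7, 2, 3)     # conv1 (7x7, stride 2)      -> 112
s = conv_out(s, 3, 2, 1)     # 3x3 max pool (stride 2)    -> 56
# conv2_x keeps stride 1; conv3_x..conv5_x each halve the resolution
for _ in ("conv3_x", "conv4_x", "conv5_x"):
    s //= 2
print(s)  # 7
```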
Tab. 2  Parameters of fully connected layers

Layer name | Input size | Output size
---|---|---
fc1 | 12 544 | 1 024
fc2 | 1 024 | 1 024
fc3 (Bbox_Pred) | 1 024 | 24
fc4 (Class_Prob) | 1 024 | 7
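The head in Tab. 2 can be sketched as plain matrix multiplications. Reading 12 544 as a flattened 7×7×256 RoI feature, 24 as 4 box coordinates × 6 classes, and 7 as 6 classes plus background is an assumption consistent with the table, not stated explicitly by the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialised weights with the sizes from Tab. 2 (a structural
# sketch only; the actual network learns these during training).
W1 = rng.standard_normal((12544, 1024), dtype=np.float32) * 0.01
W2 = rng.standard_normal((1024, 1024), dtype=np.float32) * 0.01
W_box = rng.standard_normal((1024, 24), dtype=np.float32) * 0.01
W_cls = rng.standard_normal((1024, 7), dtype=np.float32) * 0.01

def head(roi_feat):
    """roi_feat: (N, 12544) flattened RoI features -> (box deltas, class logits)."""
    h = np.maximum(roi_feat @ W1, 0)   # fc1 + ReLU
    h = np.maximum(h @ W2, 0)          # fc2 + ReLU
    return h @ W_box, h @ W_cls        # fc3 (Bbox_Pred), fc4 (Class_Prob)

boxes, logits = head(rng.standard_normal((3, 12544), dtype=np.float32))
print(boxes.shape, logits.shape)  # (3, 24) (3, 7)
```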
Tab. 1  Evaluation metrics used in the experiments and their meanings

Metric | Meaning
---|---
AP | Average precision: proportion of correct results among all detections
AP50 | AP of detections at IoU = 0.50
AP75 | AP of detections at IoU = 0.75
APS | AP for small objects (area < 32²)
APM | AP for medium objects (32² < area < 96²)
APL | AP for large objects (area > 96²)
AR | Average recall: proportion of all positives in the test set that are correctly detected
AR10 | Mean of the maximum recall with at most 10 detections per image
AR100 | Mean of the maximum recall with at most 100 detections per image
ARS | AR for small objects (area < 32²)
ARM | AR for medium objects (32² < area < 96²)
ARL | AR for large objects (area > 96²)
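The IoU thresholds and the 32²/96² area buckets in Tab. 1 can be made concrete with two small helpers; this is a sketch of the COCO-style definitions, not the evaluation code used in the paper:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def size_bucket(box):
    """COCO-style size bucket behind APS/APM/APL and ARS/ARM/ARL."""
    area = (box[2] - box[0]) * (box[3] - box[1])
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"

print(iou([0, 0, 10, 10], [5, 0, 15, 10]))  # overlap 50 / union 150 -> 0.333...
print(size_bucket([0, 0, 100, 100]))        # area 10000 > 96^2 -> large
```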
Tab. 2  APs of different detection methods (unit: %)

Proposal generation method | AP | AP50 | AP75 | APS | APM | APL
---|---|---|---|---|---|---
PRRPN | 34.49 | 60.95 | 35.05 | 7.75 | 24.03 | 46.28
RPN | 36.69 | 66.99 | 36.72 | 5.66 | 28.76 | 47.74
PRRPN+RPN | 37.03 | 64.90 | 38.54 | 5.90 | 29.17 | 47.68
Tab. 3  ARs of different detection methods

Proposal generation method | AR | AR10 | AR100 | ARS | ARM | ARL
---|---|---|---|---|---|---
PRRPN | 0.268 | 0.428 | 0.433 | 0.101 | 0.335 | 0.543
RPN | 0.290 | 0.476 | 0.488 | 0.242 | 0.433 | 0.569
PRRPN+RPN | 0.292 | 0.478 | 0.490 | 0.249 | 0.435 | 0.568
Tab. 4  Per-class APs of different detection methods (unit: %)

Proposal generation method | Person | Bicycle | Car | Motorcycle | Bus | Truck
---|---|---|---|---|---|---
PRRPN | 13.51 | 24.38 | 45.85 | 24.19 | 60.94 | 38.08
RPN | 19.33 | 25.65 | 50.05 | 18.07 | 66.16 | 40.89
PRRPN+RPN | 18.88 | 26.10 | 50.19 | 19.36 | 66.59 | 41.08