Object detection method based on radar and camera fusion


Journal of Computer Applications, 2021, Vol. 41, Issue (11): 3242-3250. DOI: 10.11772/j.issn.1001-9081.2021020327

Special Issue: Artificial intelligence


Jie GAO¹, Yuan ZHU², Ke LU²

¹ Chinesisch-Deutsches Hochschulkolleg, Tongji University, Shanghai 200092, China
² School of Automotive Studies, Tongji University, Shanghai 201804, China

Received: 2021-03-05 Revised: 2021-04-15 Accepted: 2021-04-20 Online: 2021-04-29 Published: 2021-11-10
Contact: Yuan ZHU
About the authors: GAO Jie, born in 1996 in Liupanshui, Guizhou, M. S. candidate. Her research interests include multi-sensor fusion, multi-object tracking, and object detection.
ZHU Yuan, born in 1976 in Taizhou, Jiangsu, Ph. D., associate professor. His research interests include electrical drive systems of new energy vehicles, embedded software for automotive electronics, and multi-sensor fusion for intelligent driving.
LU Ke, born in 1983 in Changzhou, Jiangsu, Ph. D., engineer. His research interests include embedded systems for automotive electronics, perception algorithms for autonomous vehicles, and functional safety.


Abstract

In autonomous driving perception systems, multi-sensor fusion is commonly used to improve the reliability of perception results. For the object detection task in a fusion perception system, an object detection method based on radar and camera fusion, the Priori and Radar Region Proposal Network (PRRPN), was proposed. It uses radar measurements and the object detection results of the previous frame to improve region proposal generation in the image detection network and thereby improve detection performance. Firstly, the objects detected in the previous frame were associated with the radar points in the current frame to pre-classify the radar points. Then, the pre-classified radar points were projected into the image, and the corresponding prior region proposals and radar region proposals were obtained according to the radar range and Radar Cross Section (RCS) information. Finally, regression and classification of the object bounding boxes were performed on the region proposals. In addition, PRRPN was fused with the Region Proposal Network (RPN) to carry out object detection. The three detection methods (PRRPN, RPN, and PRRPN+RPN) were tested and evaluated on the nuScenes dataset. Experimental results show that, compared with RPN, PRRPN not only detects objects faster but also increases the average precision on small objects by 2.09 percentage points. Compared with PRRPN alone and RPN alone, the method fusing PRRPN and RPN increases the average precision by 2.54 and 0.34 percentage points, respectively.
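
The first two steps of the abstract (pre-classifying radar points against previous-frame objects, then projecting them into the image to seed region proposals) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the circular gate and its `gate_radius`, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the fixed `object_size`, and all function and class names. The paper also uses RCS when sizing proposals; this page does not say how, so the sketch scales boxes by range only.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional, Tuple


@dataclass
class RadarPoint:
    x: float  # lateral offset in the camera frame, metres (right positive)
    y: float  # vertical offset in the camera frame, metres (down positive)
    z: float  # depth along the optical axis, metres (must be > 0)


@dataclass
class PriorObject:
    x: float    # ground-plane position from the previous frame, metres
    z: float
    label: str  # class assigned by the detector in the previous frame


def pre_classify(point: RadarPoint, priors: List[PriorObject],
                 gate_radius: float = 2.0) -> Optional[str]:
    """Label a radar point with the nearest previous-frame object inside
    a circular gate (hypothetical gate shape); None if nothing is close."""
    best_label, best_dist = None, gate_radius
    for obj in priors:
        dist = hypot(point.x - obj.x, point.z - obj.z)
        if dist <= best_dist:
            best_label, best_dist = obj.label, dist
    return best_label


def project(point: RadarPoint, fx: float, fy: float,
            cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    return fx * point.x / point.z + cx, fy * point.y / point.z + cy


def region_proposal(point: RadarPoint, fx: float, fy: float, cx: float,
                    cy: float, object_size: float = 2.0
                    ) -> Tuple[float, float, float, float]:
    """Box (x1, y1, x2, y2) centred on the projected point; its pixel
    size shrinks in proportion to range, as for an object of fixed size."""
    u, v = project(point, fx, fy, cx, cy)
    half_w = 0.5 * fx * object_size / point.z
    half_h = 0.5 * fy * object_size / point.z
    return (u - half_w, v - half_h, u + half_w, v + half_h)


point = RadarPoint(x=1.0, y=0.0, z=20.0)
priors = [PriorObject(x=1.5, z=20.5, label="car")]
label = pre_classify(point, priors)
box = region_proposal(point, fx=1000.0, fy=1000.0, cx=800.0, cy=450.0)
```

In this toy setup the radar point falls inside the gate of the previous-frame car, so it inherits that label, and the proposal is a 100-pixel box around the projected point. A real pipeline would first transform radar points from the radar frame to the camera frame using the sensor extrinsics.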

Keywords: object detection, neural network, sensor fusion, radar, camera


CLC Number: TP391.4

Jie GAO, Yuan ZHU, Ke LU. Object detection method based on radar and camera fusion[J]. Journal of Computer Applications, 2021, 41(11): 3242-3250.


Figures/Tables 15

Fig. 1 Detection network structure of PRRPN

Fig. 2 Network structure of feature extraction layer

Tab. 1 Structural parameters of ResNet50

Layer name	Layer structure	Stride
conv1	7×7, 64	2
conv2_x	3×3 max pool	2
conv2_x	[1×1, 64; 3×3, 64; 1×1, 256] × 3	1
conv3_x	[1×1, 128; 3×3, 128; 1×1, 512] × 4	1
conv4_x	[1×1, 256; 3×3, 256; 1×1, 1024] × 6	1
conv5_x	[1×1, 512; 3×3, 512; 1×1, 2048] × 3	1

Tab. 2 Parameters of fully connected layer

Layer name	Input size	Output size
fc1	12544	1024
fc2	1024	1024
fc3 (Bbox_Pred)	1024	24
fc4 (Class_Prob)	1024	7
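
As a rough sense of scale, the input and output sizes in the table above imply the following parameter counts. This is a back-of-the-envelope sketch that assumes each layer is a plain dense layer with one bias per output; the table gives only the sizes, so the bias assumption and the helper name `param_count` are illustrative.

```python
# (name, input size, output size), copied from the table above.
FC_LAYERS = [
    ("fc1", 12544, 1024),
    ("fc2", 1024, 1024),
    ("fc3 (Bbox_Pred)", 1024, 24),
    ("fc4 (Class_Prob)", 1024, 7),
]


def param_count(n_in: int, n_out: int, bias: bool = True) -> int:
    """Weights plus (optionally) one bias per output unit."""
    return n_in * n_out + (n_out if bias else 0)


counts = {name: param_count(n_in, n_out) for name, n_in, n_out in FC_LAYERS}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n:,}")
print(f"total: {total:,}")
```

Under these assumptions fc1 accounts for roughly 12.8 million of about 13.9 million parameters, so the first fully connected layer dominates the size of the detection head.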

Fig. 3 Schematic diagram of radar coordinate system and camera coordinate system

Fig. 4 Association gate

Fig. 5 Generation of prior region proposals

Fig. 6 Convergence curves of training process

Fig. 7 Schematic diagram of PRRPN detection

Tab. 1 Evaluation indexes for experiment and their meanings

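Index	Meaning
AP	Average precision: the proportion of correct results among the detection results
AP⁵⁰	AP at IoU = 0.50
AP⁷⁵	AP at IoU = 0.75
AP^S	AP for small objects with area ≤ 32²
AP^M	AP for medium objects with 32² < area < 96²
AP^L	AP for large objects with area ≥ 96²
AR	Average recall: the proportion of all positive samples in the test set that are correctly detected
AR¹⁰	Mean of the maximum recall given 10 detections per image in the test set
AR¹⁰⁰	Mean of the maximum recall given 100 detections per image in the test set
AR^S	AR for small objects with area ≤ 32²
AR^M	AR for medium objects with 32² < area < 96²
AR^L	AR for large objects with area ≥ 96²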
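
The two quantities these indexes depend on, the IoU between a detection and a ground-truth box and the size class of an object, can be sketched as below. The sketch takes the size thresholds as 32² and 96² pixels, following the COCO convention, and uses the inclusive and exclusive bounds as listed in the table; the function names and the `(x1, y1, x2, y2)` box layout are assumptions for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def area(box: Box) -> float:
    """Box area; degenerate boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def size_bucket(box: Box) -> str:
    """Return 'S', 'M' or 'L' using the area thresholds from the table."""
    a = area(box)
    if a <= 32 ** 2:
        return "S"
    if a < 96 ** 2:
        return "M"
    return "L"


overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
buckets = [size_bucket(b) for b in [(0, 0, 32, 32), (0, 0, 50, 50), (0, 0, 96, 96)]]
```

A detection counts as correct for AP⁵⁰ when its IoU with a ground-truth box is at least 0.50, and for AP⁷⁵ when it is at least 0.75; the two half-overlapping boxes in the example have an IoU of one third and would fail both.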

Tab. 2 APs of different detection methods

Proposal generation method	AP	AP⁵⁰	AP⁷⁵	AP^S	AP^M	AP^L
PRRPN	34.49	60.95	35.05	7.75	24.03	46.28
RPN	36.69	66.99	36.72	5.66	28.76	47.74
PRRPN+RPN	37.03	64.90	38.54	5.90	29.17	47.68
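
The percentage-point gains quoted in the abstract can be checked directly against the AP and AP^S columns of this table. The short script below does only that arithmetic; the dictionary layout and the helper name `gain` are illustrative.

```python
# AP and AP^S values copied from the table above (in percent).
AP_TABLE = {
    "PRRPN":     {"AP": 34.49, "AP_S": 7.75},
    "RPN":       {"AP": 36.69, "AP_S": 5.66},
    "PRRPN+RPN": {"AP": 37.03, "AP_S": 5.90},
}


def gain(method: str, baseline: str, metric: str) -> float:
    """Difference in percentage points, rounded to two decimals."""
    return round(AP_TABLE[method][metric] - AP_TABLE[baseline][metric], 2)


small_object_gain = gain("PRRPN", "RPN", "AP_S")     # PRRPN vs RPN, small objects
fused_vs_prrpn = gain("PRRPN+RPN", "PRRPN", "AP")    # fused vs PRRPN alone
fused_vs_rpn = gain("PRRPN+RPN", "RPN", "AP")        # fused vs RPN alone

print(small_object_gain, fused_vs_prrpn, fused_vs_rpn)
```

The three differences come out to 2.09, 2.54, and 0.34 percentage points, matching the figures in the abstract. The same table also shows that PRRPN alone trails RPN on overall AP (34.49 versus 36.69), so its advantage is specific to small objects.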

Tab. 3 ARs of different detection methods

Proposal generation method	AR	AR¹⁰	AR¹⁰⁰	AR^S	AR^M	AR^L
PRRPN	0.268	0.428	0.433	0.101	0.335	0.543
RPN	0.290	0.476	0.488	0.242	0.433	0.569
PRRPN+RPN	0.292	0.478	0.490	0.249	0.435	0.568

Tab. 4 APs of different detection methods for different classes


Fig. 8 Comparison of PRRPN and RPN detection results

Fig. 9 Detection result comparison of PRRPN+RPN and RPN

