Dynamic visual SLAM algorithm incorporating object detection and feature point association

doi:10.11772/j.issn.1001-9081.2024020227

Abstract


To address the problem that dynamic objects seriously interfere with the normal operation of Simultaneous Localization And Mapping (SLAM) systems, a dynamic visual SLAM algorithm based on object detection and feature point association is proposed. Firstly, the YOLOv5 (You Only Look Once version 5) object detection network is used to obtain information about potentially dynamic objects in the environment, and missed detections are compensated for by simple target tracking. Secondly, because geometric constraints applied to single feature points are prone to misjudgment, associations between feature points are established from the positional information and optical flow information of the image, and the dynamics of the resulting relation network are then judged in combination with the epipolar constraint. Thirdly, the two methods are combined to eliminate dynamic feature points from the image, and the remaining static feature points are weighted to estimate the camera pose. Finally, a dense point cloud map of the static environment is built. Comparison and ablation experiments on the TUM (Technical University of Munich) public dataset show that, in highly dynamic scenarios, the proposed algorithm reduces the Root Mean Square Error (RMSE) of the Absolute Trajectory Error (ATE) by at least 95.22% compared with ORB-SLAM2 and by at least 5.61% compared with DS-SLAM (Dynamic Semantic SLAM). The proposed algorithm thus improves accuracy and robustness while maintaining real-time performance.

Key words: dynamic environment, object detection, Simultaneous Localization And Mapping (SLAM), dense point cloud map, optical flow method
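The epipolar constraint mentioned in the abstract can be illustrated for a single matched feature point. The following is only a minimal sketch under stated assumptions, not the authors' implementation: it assumes a fundamental matrix `F` between two frames is already available, and the function names and the pixel threshold `threshold_px` are hypothetical. The paper applies the constraint to a network of associated feature points rather than to isolated points, which this sketch does not reproduce.

```python
import math


def epipolar_distance(F, p1, p2):
    """Distance (in pixels) from p2 to the epipolar line of p1 in the second image.

    F  : 3x3 fundamental matrix as nested lists (assumed to be given)
    p1 : (x, y) pixel coordinates of the feature in the first frame
    p2 : (x, y) pixel coordinates of the matched feature in the second frame
    """
    x1, y1 = p1
    x2, y2 = p2
    # Epipolar line l = F * [x1, y1, 1]^T, written as a*x + b*y + c = 0
    a, b, c = (row[0] * x1 + row[1] * y1 + row[2] for row in F)
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * x2 + b * y2 + c) / norm


def is_dynamic(F, p1, p2, threshold_px=1.0):
    """Flag a match as potentially dynamic (threshold_px is an assumed value)."""
    return epipolar_distance(F, p1, p2) > threshold_px


# Toy example: pure camera translation along x with identity intrinsics,
# so F = [t]_x with t = (1, 0, 0) and epipolar lines are horizontal.
F = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
print(epipolar_distance(F, (10, 20), (15, 20)))  # 0.0 (consistent with a static point)
print(epipolar_distance(F, (10, 20), (15, 23)))  # 3.0 (off the epipolar line)
```

A static point should lie close to its epipolar line, so a large distance suggests independent motion; judging each point in isolation is the single-point check that the abstract describes as prone to misjudgment.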

CLC Number:

TP242.6

Shijia WEN, Shijun JING. Dynamic visual SLAM algorithm incorporating object detection and feature point association[J]. Journal of Computer Applications, 2025, 45(2): 610-615.

References

1	CADENA C, CARLONE L, CARRILLO H, et al. Past, present, and future of simultaneous localization and mapping: toward the robust-perception age[J]. IEEE Transactions on Robotics, 2016, 32(6): 1309-1332.
2	DAVISON A J, REID I D, MOLTON N D, et al. MonoSLAM: real-time single camera SLAM[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1052-1067.
3	FORSTER C, PIZZOLI M, SCARAMUZZA D. SVO: fast semi-direct monocular visual odometry[C]// Proceedings of the 2014 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2014: 15-22.
4	ENGEL J, KOLTUN V, CREMERS D. Direct sparse odometry[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 611-625.
5	MUR-ARTAL R, TARDÓS J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5): 1255-1262.
6	SINGH G, WU M, DO M V, et al. Fast semantic-aware motion state detection for visual SLAM in dynamic environment[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(12): 23014-23030.
7	ZHANG C, ZHANG R, JIN S, et al. PFD-SLAM: a new RGB-D SLAM for dynamic indoor environments based on non-prior semantic segmentation[J]. Remote Sensing, 2022, 14(10): No.2445.
8	SUN Y, LIU M, MENG M Q H. Improving RGB-D SLAM in dynamic environments: a motion removal approach[J]. Robotics and Autonomous Systems, 2017, 89: 110-122.
9	LI S, LEE D. RGB-D SLAM in dynamic environments using static point weighting[J]. IEEE Robotics and Automation Letters, 2017, 2(4): 2263-2270.
10	HUANG Z X, SHAO C L. Survey of visual SLAM based on deep learning[J]. Robot, 2023, 45(6): 756-768.
11	XI Z H, WEN J X. Indoor dynamic scene localization and mapping based on target detection[J]. Journal of Computer Applications, 2022, 42(9): 2853-2857.
12	LIU J, LI X, LIU Y, et al. RGB-D inertial odometry for a resource-restricted robot in dynamic environments[J]. IEEE Robotics and Automation Letters, 2022, 7(4): 9573-9580.
13	LIU F Y, CHENG X H, CAO Y. An indoor dynamic SLAM method based on deep learning and feature point velocity constraint[J]. Journal of Chinese Inertial Technology, 2023, 31(5): 438-443.
14	BESCOS B, FÁCIL J M, CIVERA J, et al. DynaSLAM: tracking, mapping and inpainting in dynamic scenes[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 4076-4083.
15	HE K, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 386-397.
16	YU C, LIU Z, LIU X J, et al. DS-SLAM: a semantic visual SLAM towards dynamic environments[C]// Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2018: 1168-1174.
17	BADRINARAYANAN V, KENDALL A, CIPOLLA R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481-2495.
18	XIAO T Z Z, ZHOU X B, LUO X, et al. Robust RGB-D SLAM system incorporating instance segmentation and clustering in dynamic environment[J]. Journal of Computer Applications, 2023, 43(4): 1220-1225.
19	XU Z, RONG Z, WU Y. A survey: which features are required for dynamic visual simultaneous localization and mapping?[J]. Visual Computing for Industry, Biomedicine, and Art, 2021, 4: No.20.
20	Ultralytics. YOLOv5[EB/OL]. [2024-01-24].
21	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755.
22	STURM J, ENGELHARD N, ENDRES F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]// Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway: IEEE, 2012: 573-580.
23	GRUPP M. EVO: Python package for the evaluation of odometry and SLAM[EB/OL]. [2024-01-24].

Dynamic level	Sequence	ORB-SLAM2 ATE/m		Proposed algorithm ATE/m		Improvement/%	
		RMSE	STD	RMSE	STD	RMSE	STD
Static	fr1_xyz	0.0098	0.0053	0.0099	0.0052	-1.02	1.89
Static	fr1_desk	0.0158	0.0092	0.0158	0.0094	0.00	-2.17
Low dynamic	fr3_sitting_xyz	0.0096	0.0048	0.0093	0.0046	3.13	4.17
Low dynamic	fr3_sitting_static	0.0086	0.0042	0.0070	0.0035	18.60	16.67
High dynamic	fr3_walking_rpy	0.9031	0.4927	0.0328	0.0183	96.37	96.29
High dynamic	fr3_walking_xyz	0.7357	0.4240	0.0143	0.0067	98.06	98.42
High dynamic	fr3_walking_static	0.3835	0.1349	0.0074	0.0034	98.07	97.48
High dynamic	fr3_walking_halfsphere	0.5984	0.3002	0.0286	0.0142	95.22	95.27
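The improvement columns in the table above are consistent with a relative error reduction with respect to the ORB-SLAM2 baseline. The sketch below reproduces two of the RMSE entries; the function name `improvement_percent` is hypothetical, and the formula is inferred from the tabulated values rather than quoted from the paper.

```python
def improvement_percent(baseline, proposed):
    """Relative error reduction of `proposed` with respect to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline - proposed) / baseline * 100.0


# ATE RMSE values (m) copied from the table: (ORB-SLAM2, proposed algorithm)
rmse = {
    "fr1_xyz": (0.0098, 0.0099),
    "fr3_walking_halfsphere": (0.5984, 0.0286),
}
for name, (orb_slam2, proposed) in rmse.items():
    print(f"{name}: {improvement_percent(orb_slam2, proposed):.2f}%")
# fr1_xyz: -1.02%
# fr3_walking_halfsphere: 95.22%
```

The 95.22% on fr3_walking_halfsphere is the smallest RMSE improvement among the highly dynamic sequences, which matches the "at least 95.22%" stated in the abstract. Applying the same formula to the fr3_walking_halfsphere RMSE values 0.0303 m (DS-SLAM) and 0.0286 m (proposed) gives 5.61%.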

Dynamic level	Sequence	Proposed algorithm			ORB-SLAM2 + YOLOv5s			ORB-SLAM2 + geometric constraint (with feature association)			ORB-SLAM2 + YOLOv5s + geometric constraint (without feature association)		
		ATE/m	R.RPE/(°)	T.RPE/m	ATE/m	R.RPE/(°)	T.RPE/m	ATE/m	R.RPE/(°)	T.RPE/m	ATE/m	R.RPE/(°)	T.RPE/m
Static	fr1_xyz	0.0099	0.3879	0.0060	0.0102	0.4057	0.0062	0.0099	0.3984	0.0060	0.0101	0.3869	0.0059
Static	fr1_desk	0.0158	0.5636	0.0095	0.0162	0.5723	0.0094	0.0150	0.5756	0.0093	0.0150	0.5603	0.0093
Low dynamic	fr3_sitting_xyz	0.0093	0.3159	0.0085	0.0095	0.3144	0.0086	0.0172	0.3282	0.0099	0.0151	0.3275	0.0096
Low dynamic	fr3_sitting_static	0.0070	0.1652	0.0053	0.0058	0.1503	0.0044	0.0061	0.1503	0.0045	0.0060	0.1516	0.0047
High dynamic	fr3_walking_xyz	0.0143	0.3854	0.0115	0.0151	0.3893	0.0161	0.2423	0.6004	0.1333	0.0180	0.4045	0.0227
High dynamic	fr3_walking_rpy	0.0328	0.5000	0.0213	0.0441	0.5265	0.0457	0.2283	0.5419	0.0618	0.1565	0.5764	0.1565
High dynamic	fr3_walking_static	0.0074	0.1760	0.0060	0.0098	0.1872	0.0099	0.0099	0.2069	0.0096	0.0101	0.2018	0.0101
High dynamic	fr3_walking_halfsphere	0.0286	0.4180	0.0138	0.0330	0.4206	0.0258	0.0294	0.4856	0.0309	0.0368	0.4285	0.0239

Dynamic level	Sequence	Proposed system		Dynamic-VINS [12]		DynaSLAM [14]		DS-SLAM [16]		ISC-SLAM [18]	
		RMSE	STD	RMSE	STD	RMSE	STD	RMSE	STD	RMSE	STD
Low dynamic	fr3_sitting_xyz	0.0096	0.0048	-	-	0.0150	0.0065	-	-	0.0205	0.0098
Low dynamic	fr3_sitting_static	0.0086	0.0042	-	-	-	-	0.2735	0.1215	-	-
High dynamic	fr3_walking_xyz	0.0143	0.0067	0.0486	-	0.0150	0.0086	0.0247	0.0161	0.0148	0.0077
High dynamic	fr3_walking_rpy	0.0328	0.0173	0.0629	-	0.0350	0.0437	0.4442	0.2350	0.0358	0.0207
High dynamic	fr3_walking_static	0.0074	0.0034	0.0077	-	0.0060	0.0034	0.0081	0.0036	0.0073	0.0033
High dynamic	fr3_walking_halfsphere	0.0286	0.0142	0.0608	-	0.0250	0.0161	0.0303	0.0159	0.0303	0.0146
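The RMSE and STD columns summarize per-frame trajectory errors. As a rough illustration of how such statistics are commonly obtained, the sketch below computes them from two position sequences. It assumes the estimated and ground-truth trajectories are already time-associated and aligned in a common frame (the alignment step is omitted), and it assumes the population standard deviation. The function name `ate_stats` is hypothetical, and the paper's exact evaluation settings are not given here.

```python
import math
import statistics


def ate_stats(estimated, ground_truth):
    """Return (RMSE, STD) of the translational error between two trajectories.

    Both inputs are equal-length sequences of (x, y, z) positions that are
    assumed to be time-associated and expressed in the same frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-frame absolute translational error (Euclidean distance)
    errors = [math.dist(e, g) for e, g in zip(estimated, ground_truth)]
    rmse = math.sqrt(sum(err * err for err in errors) / len(errors))
    std = statistics.pstdev(errors)  # population standard deviation (assumption)
    return rmse, std


# Toy trajectories: the second pose is off by 5 m, the first is exact.
est = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
rmse, std = ate_stats(est, gt)
print(f"RMSE = {rmse:.4f} m, STD = {std:.4f} m")
```

RMSE penalizes large per-frame errors more heavily than the mean does, which is why the few frames where tracking locks onto a moving object dominate the ORB-SLAM2 results on the walking sequences.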