Journal of Computer Applications, 2016, Vol. 36, Issue (3): 774-778. DOI: 10.11772/j.issn.1001-9081.2016.03.774


Three-dimensional SLAM using Kinect and visual dictionary

LONG Chao, HAN Bo, ZHANG Yu   

  1. School of Aeronautics and Astronautics, Zhejiang University, Hangzhou Zhejiang 310027, China
  • Received: 2015-08-24; Revised: 2015-09-28; Online: 2016-03-10; Published: 2016-03-17

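  • Corresponding author: LONG Chao
  • About the authors: LONG Chao (born 1992), male, from Shishou, Hubei, M.S. candidate; main research interests: computer vision and robot navigation. HAN Bo (born 1969), male, from Cixi, Zhejiang, associate professor, Ph.D.; main research interests: navigation and control of unmanned aerial vehicles. ZHANG Yu (born 1980), male, from Hangzhou, Zhejiang, lecturer, Ph.D.; main research interests: computer vision, and navigation and control of unmanned aerial vehicles.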

Abstract: Traditional filter-based methods for Simultaneous Localization And Mapping (SLAM) accumulate error over time. To address this, a three-dimensional SLAM algorithm based on a Bag-Of-Words (BOW) visual dictionary was proposed, which effectively addresses error accumulation during long-duration robot motion. In place of the random selection and k-Dimensional Tree (Kd-Tree) methods commonly used in graph-optimization SLAM, a tree-structured visual dictionary loop closure detection algorithm was designed, which greatly speeds up the retrieval of similar scenes. Firstly, image features were extracted with a GPU-based feature extraction algorithm, and robust inliers were obtained through cross matching and the k-Nearest Neighbor (kNN) algorithm. Secondly, the Random Sample Consensus Singular Value Decomposition (RANSAC SVD) algorithm was used to calculate the initial pose transformation between adjacent frames, and the Generalized-Iterative Closest Point (G-ICP) algorithm was then used to refine it into a precise transformation. Finally, the incremental Smoothing And Mapping (iSAM) graph optimization algorithm was used to calculate the final camera poses, from which the point cloud map and the motion trajectory were built. Test results on a standard dataset show that the algorithm achieves good robustness and precision in complex environments.
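
The RANSAC SVD step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the matched features have already been back-projected to 3D points using the Kinect depth image, and the function names, iteration count, and inlier threshold are hypothetical choices made for illustration only.

```python
import numpy as np


def rigid_transform_svd(src, dst):
    """Least-squares R, t such that dst ~= R @ src + t, for (N, 3) matched points."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def ransac_rigid_transform(src, dst, iters=200, inlier_thresh=0.05, seed=0):
    """RANSAC around the SVD solver (iters and inlier_thresh are assumed values).

    Returns R, t and a boolean inlier mask over the input matches.
    """
    rng = np.random.default_rng(seed)
    n = len(src)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iters):
        # Three point pairs are the minimal sample for a 3D rigid transform.
        idx = rng.choice(n, size=3, replace=False)
        R, t = rigid_transform_svd(src[idx], dst[idx])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < inlier_thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found too few inliers to fit a transform")
    # Refit on every inlier of the best hypothesis for a more stable estimate.
    R, t = rigid_transform_svd(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers
```

In a pipeline like the one described, the returned `R` and `t` would serve only as the initial guess that a G-ICP refinement then improves; the refinement and the iSAM back end are outside the scope of this sketch.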

Key words: three-dimensional Simultaneous Localization And Mapping (3D-SLAM), loop closure, Bag-Of-Words (BOW), Graphics Processing Unit (GPU), Generalized-Iterative Closest Point (G-ICP), incremental Smoothing And Mapping (iSAM)

