Journal of Computer Applications ›› 2022, Vol. 42 ›› Issue (9): 2853-2857. DOI: 10.11772/j.issn.1001-9081.2021061077

• Multimedia computing and computer simulation •

Indoor dynamic scene localization and mapping based on target detection

Zhihong XI, Jiaxu WEN

  1. College of Information and Communication Engineering, Harbin Engineering University, Harbin Heilongjiang 150001, China
  • Received: 2021-06-25 Revised: 2021-11-20 Accepted: 2021-11-24 Online: 2022-01-25 Published: 2022-09-10
  • Contact: Jiaxu WEN
  • About author: XI Zhihong, born in 1965 in Harbin, Heilongjiang, Ph. D., professor. Her research interests include image processing and indoor localization.
  • Supported by:
    National Natural Science Foundation of China (60875025)

Abstract:

To address the problem that dynamic objects in indoor scenes seriously degrade the accuracy of camera pose estimation, a Simultaneous Localization And Mapping (SLAM) system for indoor dynamic scenes based on target detection was proposed. After the camera captured an image, the YOLOv4 target detection network was first used to detect dynamic objects in the environment and to generate mask areas from the corresponding bounding boxes. Then, the ORB feature points in the image were extracted, and the feature points inside the mask areas were removed. At the same time, the Grid-based Motion Statistics (GMS) algorithm was used to further eliminate mismatches, and only the remaining static feature points were used to estimate the camera pose. Finally, a static dense point cloud map and an octomap with dynamic objects filtered out were constructed. Results of multiple comparison tests on the TUM RGB-D public dataset show that, compared with the ORB-SLAM2, GCNv2_SLAM and YOLOv4+ORB-SLAM2 systems, the proposed system significantly reduces the Absolute Trajectory Error (ATE) and the Relative Pose Error (RPE), indicating that it can significantly improve the accuracy of camera pose estimation in indoor dynamic environments.
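The feature-removal step described in the abstract can be illustrated with a minimal sketch. It assumes that the detector output is already available as axis-aligned bounding boxes in pixel coordinates and that the feature points are given as (x, y) pairs; the function name `filter_static_keypoints` and the box layout `(x_min, y_min, x_max, y_max)` are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def filter_static_keypoints(keypoints, boxes):
    """Keep only the keypoints that lie outside every bounding box.

    keypoints: array-like of shape (N, 2) holding (x, y) pixel coordinates.
    boxes: iterable of (x_min, y_min, x_max, y_max) tuples (assumed layout).
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    inside_any = np.zeros(len(keypoints), dtype=bool)
    for x_min, y_min, x_max, y_max in boxes:
        # A keypoint is treated as dynamic if it falls inside this box.
        inside = (
            (keypoints[:, 0] >= x_min) & (keypoints[:, 0] <= x_max)
            & (keypoints[:, 1] >= y_min) & (keypoints[:, 1] <= y_max)
        )
        inside_any |= inside
    # Only the remaining (presumed static) keypoints are returned.
    return keypoints[~inside_any]
```

In a full system the surviving points would then go through feature matching and mismatch rejection before pose estimation; those stages are not shown here.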

Key words: Simultaneous Localization And Mapping (SLAM), YOLOv4 target detection, GMS (Grid-based Motion Statistics), static dense point cloud map, octomap
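For readers unfamiliar with the Absolute Trajectory Error mentioned in the abstract, the following is a rough sketch of how its root-mean-square value is commonly computed: the estimated camera positions are rigidly aligned to the ground truth with a least-squares rotation and translation, and the RMSE of the remaining position differences is reported. The sketch assumes the two trajectories are already associated by timestamp and given as (N, 3) position arrays; the function names `align_rigid` and `ate_rmse` are hypothetical, and this is not the evaluation code used by the authors.

```python
import numpy as np


def align_rigid(est, gt):
    """Least-squares rotation R and translation t such that gt ~ R @ est + t."""
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    # Cross-covariance between the centred estimated and ground-truth points.
    H = (est - mu_e).T @ (gt - mu_g)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_g - R @ mu_e
    return R, t


def ate_rmse(est, gt):
    """RMSE of translational error after rigid alignment (pre-associated poses)."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    R, t = align_rigid(est, gt)
    aligned = est @ R.T + t
    errors = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))
```

A trajectory that differs from the ground truth only by a rigid motion yields an error close to zero, so the metric reflects drift and distortion of the estimate, not the choice of coordinate frame.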

CLC number: