Simultaneous localization and semantic mapping of indoor dynamic scene based on semantic segmentation

Journal of Computer Applications, 2019, Vol. 39, Issue 10: 2847-2851. DOI: 10.11772/j.issn.1001-9081.2019040711

XI Zhihong, HAN Shuangquan, WANG Hongxu

School of Information and Communication Engineering, Harbin Engineering University, Harbin Heilongjiang 150001, China

Received: 2019-04-26 Revised: 2019-06-08 Online: 2019-10-14 Published: 2019-10-10
Corresponding author: HAN Shuangquan
About the authors: XI Zhihong (born 1965), female, from Harbin, Heilongjiang, professor, Ph.D.; research interests: image processing, indoor positioning. HAN Shuangquan (born 1993), male, from Weifang, Shandong, M.S. candidate; research interests: visual SLAM, image understanding. WANG Hongxu (born 1994), male, from Songyuan, Jilin, M.S. candidate; research interests: visual SLAM, image analysis.

Abstract

Abstract: To address the problem that dynamic objects degrade pose estimation in indoor Simultaneous Localization And Mapping (SLAM) systems, a semantic-segmentation-based SLAM system for dynamic scenes is proposed. After the camera captures an image, the image is first semantically segmented by the Pyramid Scene Parsing Network (PSPNet). Feature points are then extracted, those lying within dynamic objects are discarded, and the camera pose is estimated from the remaining static feature points. Finally, a semantic point cloud map and a semantic octree map are constructed. Repeated comparison tests on five dynamic sequences of public datasets show that, compared with a SLAM system using the SegNet network, the proposed system reduces the standard deviation of the absolute trajectory error by 6.9%-89.8%, and in the best case in high-dynamic scenes reduces the standard deviations of the translational and rotational drift by 73.61% and 72.90%, respectively. The results show that the improved system significantly reduces pose estimation error in dynamic scenes and estimates the camera pose accurately in such scenes.

Key words: semantic segmentation, dynamic scene, indoor scene, pose estimation, Visual Simultaneous Localization And Mapping (VSLAM), semantic Simultaneous Localization And Mapping (SLAM)
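
The feature-point culling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `filter_static_keypoints`, the toy label map, and the choice of treating only label 15 ("person" in the PASCAL VOC class indexing) as dynamic are assumptions made for illustration. The idea shown is simply that a keypoint is kept for pose estimation only if the segmentation label at its pixel is not a dynamic class.

```python
from typing import AbstractSet, Iterable, List, Sequence, Tuple

# Assumption: label 15 is "person" in the PASCAL VOC class indexing.
# Treating only people as dynamic is an illustrative choice, not the paper's stated list.
DYNAMIC_LABELS: AbstractSet[int] = frozenset({15})

Keypoint = Tuple[float, float]  # (x, y) pixel coordinates


def filter_static_keypoints(
    keypoints: Iterable[Keypoint],
    label_map: Sequence[Sequence[int]],
    dynamic_labels: AbstractSet[int] = DYNAMIC_LABELS,
) -> List[Keypoint]:
    """Keep only keypoints whose pixel is not labeled as a dynamic class."""
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    static: List[Keypoint] = []
    for x, y in keypoints:
        col, row = int(round(x)), int(round(y))
        # Points outside the image carry no label, so they are dropped.
        if not (0 <= row < height and 0 <= col < width):
            continue
        if label_map[row][col] not in dynamic_labels:
            static.append((x, y))
    return static


# Toy 4x4 per-pixel label map: the right two columns are labeled 15 (dynamic).
label_map = [
    [0, 0, 15, 15],
    [0, 0, 15, 15],
    [0, 0, 15, 15],
    [0, 0, 15, 15],
]
keypoints = [(0.0, 0.0), (1.2, 2.9), (2.0, 1.0), (3.4, 3.0), (5.0, 1.0)]
static_points = filter_static_keypoints(keypoints, label_map)
print(static_points)
```

In a full pipeline the label map would come from the segmentation network and the surviving points would be passed on to the pose estimator; both stages are outside the scope of this sketch.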

CLC number: TP242.6

XI Zhihong, HAN Shuangquan, WANG Hongxu. Simultaneous localization and semantic mapping of indoor dynamic scene based on semantic segmentation[J]. Journal of Computer Applications, 2019, 39(10): 2847-2851.

