Abstract: VoxelNet is the first end-to-end object detection model based on point clouds; taking only point cloud data as input, it achieves good results. However, because VoxelNet takes the point cloud of the full scene as input, much of the computation is spent on background points, and false detections and missed detections easily occur in complex scenes, since a point cloud carrying only geometric information offers low recognition granularity on the targets. To solve these problems, an improved VoxelNet model augmented with view frustums was proposed. Firstly, the targets of interest were located in the RGB front-view image. Then, the 2D targets were lifted into 3D, turning each target into a spatial view frustum. Finally, the view frustum candidate regions were extracted from the point cloud to filter out redundant points, and only the points within the candidate regions were processed to obtain the detection results. Compared with VoxelNet, the improved algorithm reduces the computational complexity of point cloud processing and avoids computation on background points, thereby increasing detection efficiency. At the same time, it avoids interference from redundant background points and decreases the false detection rate and missed detection rate. Experimental results on the KITTI dataset show that the improved algorithm outperforms VoxelNet in 3D detection, with average precisions of 67.92%, 59.98% and 53.95% at the easy, moderate and hard levels respectively.
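To make the frustum extraction step concrete, below is a minimal sketch of how points inside a 2D detection's viewing frustum can be selected, assuming KITTI-style calibration matrices (P2, R0_rect, Tr_velo_to_cam); the function name frustum_filter and the exact box format are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def frustum_filter(points, box2d, P2, R0_rect, Tr_velo_to_cam):
    # Illustrative sketch, assuming KITTI calibration conventions.
    # points: (N, 4) LiDAR points [x, y, z, reflectance]
    # box2d:  (xmin, ymin, xmax, ymax) from the 2D detector on the RGB image
    # P2 (3x4), R0_rect (3x3), Tr_velo_to_cam (3x4): calibration matrices
    n = points.shape[0]
    xyz1 = np.hstack([points[:, :3], np.ones((n, 1))])   # homogeneous LiDAR coords, (N, 4)
    cam = R0_rect @ (Tr_velo_to_cam @ xyz1.T)            # rectified camera frame, (3, N)
    in_front = cam[2] > 0                                # keep points in front of the camera
    img = P2 @ np.vstack([cam, np.ones((1, n))])         # homogeneous pixel coords, (3, N)
    u, v = img[0] / img[2], img[1] / img[2]              # perspective division
    xmin, ymin, xmax, ymax = box2d
    inside = (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)
    return points[in_front & inside]                     # points within the view frustum
```

The filtered points would then be voxelized and fed to the VoxelNet detection pipeline in place of the full-scene point cloud.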