Journal of Computer Applications ›› 2020, Vol. 40 ›› Issue (9): 2555-2560. DOI: 10.11772/j.issn.1001-9081.2019122092

• Artificial Intelligence •

Improved redundant point filtering-based 3D object detection method

SONG Yifan, ZHANG Peng, ZONG Libo, MA Bo, LIU Libo

  1. School of Information Engineering, Ningxia University, Yinchuan Ningxia 750021, China
  • Received: 2019-12-11 Revised: 2020-03-01 Online: 2020-09-10 Published: 2020-04-13
  • Corresponding author: ZHANG Peng
  • About the authors: SONG Yifan, born in 1989 in Wuzhong, Ningxia, M.S. candidate. His research interests include deep learning and object detection. ZHANG Peng, born in 1975 in Yinchuan, Ningxia, Ph.D., associate professor, CCF member. His research interests include deep learning and intelligent information processing. ZONG Libo, born in 1993 in Binzhou, Shandong, M.S. candidate. His research interests include intelligent information processing. MA Bo, born in 1995 in Wuzhong, Ningxia, M.S. candidate. His research interests include image processing. LIU Libo, born in 1974 in Yinchuan, Ningxia, Ph.D., professor, CCF member. Her research interests include image processing.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61862050) and the Ningxia Key Research and Development Plan (2017BY067).


Abstract: VoxelNet is the first end-to-end object detection network based on point cloud: it generates high-precision 3D detection boxes from point cloud data alone and achieves good results. However, VoxelNet takes the point cloud of the entire scene as input, so much of its computation is spent on background points, and because a point cloud carrying only geometric information offers low recognition granularity for targets, false detections and missed detections occur easily in complex scenes. To solve these problems, an improved VoxelNet model with a view-frustum candidate region was proposed. Firstly, the targets of interest were located in the RGB front-view image. Then, each 2D target box was lifted into a spatial view frustum, the frustum candidate region was extracted from the point cloud to filter out redundant points, and only the points within the candidate region were processed to obtain the detection results. Compared with VoxelNet, the improved algorithm reduces the amount of point cloud computation and avoids processing background points, increasing the effective computation rate; at the same time, it avoids interference from excessive background points, lowering the false detection and missed detection rates. Experimental results on the KITTI dataset show that the improved algorithm outperforms VoxelNet in 3D detection, with average precisions of 67.92%, 59.98% and 53.95% at the easy, moderate and hard levels respectively.

Key words: three-dimensional (3D) object detection, point cloud, redundant point filtering, deep learning, VoxelNet, view frustum
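The frustum-filtering step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `frustum_filter` is hypothetical, `P` is assumed to be a KITTI-style 3×4 camera projection matrix, and the points are assumed to be already expressed in the camera coordinate frame.

```python
import numpy as np

def frustum_filter(points, P, box2d):
    """Keep only the 3D points whose image-plane projection falls inside
    a 2D detection box, i.e. the points inside the box's view frustum.

    points : (N, 3) array of 3D points in the camera frame (assumption)
    P      : (3, 4) camera projection matrix, KITTI-style (assumption)
    box2d  : (xmin, ymin, xmax, ymax) detection box in pixels
    """
    # Homogeneous coordinates: (N, 4)
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    # Project onto the image plane: (N, 3) rows of (u*z, v*z, z)
    proj = pts_h @ P.T
    # Points behind the camera can never lie inside the frustum
    in_front = proj[:, 2] > 0
    # Perspective divide (guard against division by non-positive depth)
    safe_z = np.where(in_front, proj[:, 2], 1.0)
    u = proj[:, 0] / safe_z
    v = proj[:, 1] / safe_z
    xmin, ymin, xmax, ymax = box2d
    mask = in_front & (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)
    return points[mask]
```

Only the points returned by such a filter would then be voxelized and passed to the VoxelNet backbone, which is how the method avoids spending computation on background points.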

CLC Number: