Journal of Computer Applications, 2023, Vol. 43, Issue (7): 2155-2165. DOI: 10.11772/j.issn.1001-9081.2022060908

• Artificial intelligence •

Weakly perceived object detection method based on point cloud completion and multi-resolution Transformer

Jing ZHOU1, Yiyu HU1, Chengyu HU2, Tianjiang WANG3

  1. School of Artificial Intelligence, Jianghan University, Wuhan Hubei 430056, China
  2. School of Computer Science, China University of Geosciences, Wuhan Hubei 430074, China
  3. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan Hubei 430074, China
  • Received: 2022-06-23 Revised: 2022-09-16 Accepted: 2022-09-22 Online: 2022-10-18 Published: 2023-07-10
  • Contact: Jing ZHOU
  • About the authors: ZHOU Jing, born in 1981, Ph.D., professor. Her research interests include three-dimensional object detection and deep learning.
    HU Yiyu, born in 1999, M.S. candidate. Her research interests include object detection and deep learning.
    HU Chengyu, born in 1978, Ph.D., professor. His research interests include intelligent computing and deep learning.
    WANG Tianjiang, born in 1960, Ph.D., professor. His research interests include computer vision and deep learning.
  • Supported by:
    National Natural Science Foundation of China (62106086); Natural Science Foundation of Hubei Province (2021CFB564)

基于点云补全和多分辨Transformer的弱感知目标检测方法

周静1, 胡怡宇1, 胡成玉2, 王天江3

  1. 江汉大学 人工智能学院, 武汉 430056
  2. 中国地质大学 计算机学院, 武汉 430074
  3. 华中科技大学 计算机科学与技术学院, 武汉 430074
  • 通讯作者: 周静
  • 作者简介:周静(1981—),女,湖北襄阳人,教授,博士,主要研究方向:三维目标检测、深度学习;
    胡怡宇(1999—),女,湖北仙桃人,硕士研究生,主要研究方向:目标检测、深度学习;
    胡成玉(1978—),男,湖北枣阳人,教授,博士,CCF会员,主要研究方向:智能计算、深度学习;
    王天江(1960—),男,湖北武汉人,教授,博士,主要研究方向:计算机视觉、深度学习。
  • 基金资助:
    国家自然科学基金资助项目(62106086);湖北省自然科学基金资助项目(2021CFB564)

Abstract:

To address the low detection precision of weakly perceived objects, whose shapes are incomplete in distant or occluded scenes, a Weakly Perceived object detection method based on point cloud Completion and Multi-resolution Transformer (WP-CMT) was proposed. Firstly, because the down-sampling convolution operations in an object detection network discard some key information, the Part-Aware and Aggregation (Part-A2) method, which has a deconvolution up-sampling structure, was chosen as the base network to generate the initial proposals. Then, to enhance the shape and position features of the weakly perceived objects in the initial proposals, a point cloud completion module was applied to reconstruct dense point sets on the surfaces of these objects, and a novel multi-resolution Transformer feature encoding module was constructed to aggregate the completed shape features with the objects' original spatial location information. The enhanced local features of the weakly perceived objects were then captured by progressively encoding the contextual semantic correlations of the aggregated features on local coordinate point sets of different resolutions. Finally, the refined bounding boxes were generated. Experimental results show that, for weakly perceived objects at the hard level in the KITTI and Waymo datasets, WP-CMT improves average precision by 2.51 percentage points and mean average precision by 1.59 percentage points over the baseline method Part-A2, which verifies the effectiveness of the proposed method for weakly perceived object detection. In addition, ablation experiments show that the point cloud completion and multi-resolution Transformer feature encoding modules in WP-CMT effectively improve the detection performance on weakly perceived objects across different Region Proposal Network (RPN) structures.
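The idea of encoding contextual correlation on local point sets at several resolutions can be illustrated with a minimal sketch. It is not the WP-CMT module: the abstract does not say how the different resolutions are obtained or how the Transformer is configured, so the farthest point sampling step, the resolution values (8, 4, 2), the identity projections (no learned weights), and all function names below are assumptions made purely for illustration.

```python
import math

def farthest_point_sampling(points, k):
    """Pick k well-spread points: start at index 0, then repeatedly add
    the point farthest from everything already chosen. (Assumed here as
    the way to build a coarser resolution; the abstract does not specify.)"""
    if k > len(points):
        raise ValueError("k must not exceed the number of points")
    chosen = [0]
    # dist[i] = squared distance from point i to its nearest chosen point
    dist = [math.dist(p, points[0]) ** 2 for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: dist[i])
        chosen.append(nxt)
        dist = [min(d, math.dist(p, points[nxt]) ** 2)
                for d, p in zip(dist, points)]
    return [points[i] for i in chosen]

def self_attention(feats):
    """Single-head scaled dot-product self-attention with identity
    projections: each output row is a softmax-weighted average of all
    input rows, so every point is mixed with its context."""
    d = len(feats[0])
    out = []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d)
                  for key in feats]
        m = max(scores)  # subtract the max for numerical stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(wi * f[j] for wi, f in zip(w, feats)) / z
                    for j in range(d)])
    return out

def encode_multi_resolution(points, resolutions=(8, 4, 2)):
    """Apply self-attention to the same local point set at each resolution.
    Returns {resolution: list of encoded feature rows}."""
    return {k: self_attention(farthest_point_sampling(points, k))
            for k in resolutions}

# Hypothetical proposal: 16 points on a 4 x 4 grid in local coordinates.
grid = [(float(x), float(y), 0.0) for x in range(4) for y in range(4)]
encoded = encode_multi_resolution(grid)
```

Here the raw coordinates stand in for the aggregated features; a real encoder would use learned query, key, and value projections and would combine the per-resolution outputs before box refinement.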

Key words: three-dimensional object detection, weakly perceived object, point cloud completion, feature encoding, multi-resolution Transformer

摘要:

针对远距离或遮挡场景中形状缺失的弱感知目标的检测精确率低下的问题,提出一种基于点云补全和多分辨Transformer的弱感知目标检测方法(WP-CMT)。首先,考虑到目标检测网络中的下采样卷积操作会导致部分关键信息的丢失,选取具有反卷积上采样结构的部分感知聚合(Part-A2)方法作为基础网络以生成初始候选框;然后,为增强初始候选框中的弱感知目标形状及位置特征,采用点云补全模块重构弱感知目标表面的密集点集,并构建新颖的多分辨Transformer特征编码模块来聚合弱感知目标的补全形状特征和原始空间位置信息,通过逐步编码不同分辨率局部坐标点集上的聚合特征的上下文语义相关性来捕获弱感知目标增强的局部特征,最终生成精细化的目标检测框。实验结果表明:对于KITTI和Waymo数据集中的弱感知困难级别目标,WP-CMT的平均精确率和平均精确率均值分别比基准方法Part-A2提升了2.51和1.59个百分点,验证了该方法对弱感知目标检测的有效性。同时,消融实验结果表明WP-CMT中的点云补全和多分辨Transformer特征编码模块对于不同类型的区域候选网络(RPN)结构均能有效提升弱感知目标的检测性能。

关键词: 三维目标检测, 弱感知目标, 点云补全, 特征编码, 多分辨Transformer

CLC Number: