Journal of Computer Applications


Cross-Attention Multi-Modal Point Cloud Completion Network

  

  • Received: 2025-02-28  Revised: 2025-03-20  Online: 2025-03-24  Published: 2025-03-24


LIAO Zexin 1, ZHANG Shaobing 2, CHENG Miao 2

  1. Chengdu Institute of Computer Applications, University of Chinese Academy of Sciences
  2. Shenzhen Zhongchao Kexin Financial Technology Co., Ltd.
  • Corresponding author: LIAO Zexin

Abstract: To address the incompleteness of point clouds caused by occlusion, surface concavities, illumination, and other factors when objects are scanned with LiDAR and similar devices, this study proposes a 2D-view-guided 3D point cloud completion method. Existing view-guided approaches typically fuse 2D and 3D information either explicitly or implicitly. Since the fused features should simultaneously retain the global information carried by the 2D view and the local details needed for point cloud refinement, we design a novel network that integrates 2D views with point clouds for completion. The network operates in two stages: feature extraction and encoding, followed by feature decoding and point generation. In the first stage, DGCNN and ResNet extract features from the point cloud and the 2D view, respectively. These features are then fused via a cross-attention mechanism into hybrid representations, which are downsampled to obtain global features. In the second stage, the fused features are decoded with attention mechanisms and upsampled via transposed convolutions to reconstruct complete point clouds. Experimental results demonstrate that our method achieves better Chamfer Distance (CD) scores on the ShapeNet-ViPC dataset than state-of-the-art single-modal and multi-modal point cloud completion approaches.
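The two core operations named in the abstract, cross-attention fusion of point features with view features and the Chamfer Distance evaluation metric, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the feature shapes, the single-head scaled dot-product form of the attention, and the squared-distance variant of CD are all assumptions for the sake of the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(point_feats, view_feats):
    """Fuse per-point queries with 2D-view keys/values.

    point_feats: (N, C) point features (e.g. from a DGCNN backbone)
    view_feats:  (M, C) view-patch features (e.g. from a ResNet backbone)
    Returns (N, C) fused features: each point attends over all view patches.
    """
    d_k = point_feats.shape[-1]
    scores = point_feats @ view_feats.T / np.sqrt(d_k)  # (N, M) similarity
    attn = softmax(scores, axis=-1)                     # rows sum to 1
    return attn @ view_feats                            # (N, C) weighted sum

def chamfer_distance(p, q):
    """Symmetric Chamfer Distance between point sets p: (N, 3), q: (M, 3)."""
    d = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)  # (N, M) squared dists
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

In a full network, the query/key/value features would pass through learned projections and multiple heads, and CD would be computed between the generated and ground-truth clouds as both a training loss and the evaluation metric; the sketch keeps only the arithmetic skeleton of each step.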

Key words: point cloud, point cloud completion, multi-modality, self-supervised, cross-attention, geometry-aware


