Monocular 3D object detection method integrating depth and instance segmentation
Xun SUN, Ruifeng FENG, Yanru CHEN
Journal of Computer Applications    2024, 44 (7): 2208-2215.   DOI: 10.11772/j.issn.1001-9081.2023070990

Monocular 3D object detection degrades when the apparent size of an object changes with viewing perspective and when objects are occluded. To address these limitations, a monocular 3D object detection method that fuses depth information with instance segmentation masks was proposed. Firstly, a Depth-Mask Attention Fusion (DMAF) module was used to combine depth information with instance segmentation masks, providing more accurate object boundaries. Secondly, dynamic convolution was introduced, and the fused features output by the DMAF module were used to guide the generation of dynamic convolution kernels, so that objects of different scales could be handled. Moreover, a 2D-3D bounding box consistency loss was added to the loss function, adjusting the predicted 3D bounding box so that it closely coincides with the corresponding 2D detection box, thereby enhancing performance in both instance segmentation and 3D object detection. Lastly, the effectiveness of the proposed method was confirmed through ablation studies and validated on the KITTI test set. The results indicate that, compared with methods using only depth estimation maps and instance segmentation masks, the proposed method improves the average precision of vehicle detection at moderate difficulty by 6.36 percentage points, and that it outperforms comparison methods such as D4LCN (Depth-guided Dynamic-Depthwise-Dilated Local Convolutional Network) and M3D-RPN (Monocular 3D Region Proposal Network) in both 3D object detection and bird's-eye-view object detection.
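The abstract does not give the form of the 2D-3D bounding box consistency loss, so the following is only an illustrative sketch of one common way such a term can be built, not the authors' formulation. It assumes a pinhole camera with intrinsics (fx, fy, cx, cy), a box parameterised by centre, (height, width, length) and yaw about the camera Y axis, all corners in front of the camera, and a loss of 1 - IoU between the 2D box enclosing the projected 3D corners and the 2D detection box. All function names are hypothetical.

```python
import math

def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box centre; dims: (h, w, l); yaw: rotation about Y.
    This parameterisation is an assumption made for the sketch.
    """
    x, y, z = center
    h, w, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-l / 2, l / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-w / 2, w / 2):
                # Rotate the offset about the Y axis, then translate to the centre.
                corners.append((x + c * dx + s * dz, y + dy, z - s * dx + c * dz))
    return corners

def project_to_2d_box(corners, fx, fy, cx, cy):
    """Project corners with a pinhole model and return the enclosing 2D box."""
    us = [fx * X / Z + cx for X, _, Z in corners]
    vs = [fy * Y / Z + cy for _, Y, Z in corners]
    return (min(us), min(vs), max(us), max(vs))

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def consistency_loss(center, dims, yaw, box2d, intrinsics):
    """1 - IoU between the projected 3D box and the 2D detection box."""
    projected = project_to_2d_box(box3d_corners(center, dims, yaw), *intrinsics)
    return 1.0 - iou(projected, box2d)
```

Under these assumptions the loss is 0 when the projected 3D box exactly matches the 2D detection box and 1 when the two do not overlap. In a training setting the same computation would be written with a differentiable tensor library so that gradients reach the 3D box parameters.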
