Monocular 3D object detection method integrating depth and instance segmentation

Xun SUN, Ruifeng FENG, Yanru CHEN

Journal of Computer Applications 2024, 44 (7): 2208-2215. DOI: 10.11772/j.issn.1001-9081.2023070990


To address the limitations of monocular 3D object detection when apparent object size varies with perspective and occlusion, a new monocular 3D object detection method that fuses depth information with instance segmentation masks was proposed. Firstly, the Depth-Mask Attention Fusion (DMAF) module was used to combine depth information with instance segmentation masks, providing more accurate object boundaries. Secondly, dynamic convolution was introduced, and the fused features from the DMAF module were used to guide the generation of dynamic convolution kernels for handling objects of different scales. Moreover, a 2D-3D bounding box consistency term was added to the loss function, adjusting the predicted 3D bounding box to coincide closely with the corresponding 2D detection box and thereby enhancing performance in the instance segmentation and 3D object detection tasks. Lastly, the effectiveness of the proposed method was confirmed through ablation studies and validated on the KITTI test set. The results indicate that, compared with methods using only depth estimation maps and instance segmentation masks, the proposed method improves the average precision of vehicle detection at moderate difficulty by 6.36 percentage points, and that it outperforms comparison methods such as D4LCN (Depth-guided Dynamic-Depthwise-Dilated Local Convolutional Network) and M3D-RPN (Monocular 3D Region Proposal Network) in both 3D object detection and bird's-eye-view object detection tasks.
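The abstract does not give the exact form of the 2D-3D bounding box consistency loss, so the following is only a minimal sketch of one plausible reading: project the eight corners of the predicted 3D box into the image, take the enclosing rectangle, and penalize it with 1 - IoU against the 2D detection box. The function names, the (h, w, l) dimension order, the yaw-about-Y convention, the pinhole intrinsics tuple (fx, fy, cx, cy), and the choice of 1 - IoU are all assumptions made for illustration, not the authors' formulation.

```python
import math

def box3d_corners(center, dims, yaw):
    """Eight corners of a 3D box in camera coordinates.

    Assumed conventions (not from the paper): center = (x, y, z) is the box
    centre, dims = (h, w, l), and yaw is a rotation about the camera Y axis.
    """
    x, y, z = center
    h, w, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-l / 2, l / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-w / 2, w / 2):
                # Rotate the local offset about Y, then translate to the centre.
                corners.append((x + c * dx + s * dz, y + dy, z - s * dx + c * dz))
    return corners

def project_to_2d_box(corners, fx, fy, cx, cy):
    """Pinhole projection of the corners, then the enclosing 2D rectangle.

    Assumes every corner lies in front of the camera (Z > 0).
    """
    us = [fx * X / Z + cx for X, _, Z in corners]
    vs = [fy * Y / Z + cy for _, Y, Z in corners]
    return (min(us), min(vs), max(us), max(vs))

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def consistency_loss(center, dims, yaw, box2d, intrinsics):
    """Hypothetical 2D-3D consistency term: 1 - IoU(projected 3D box, 2D box)."""
    fx, fy, cx, cy = intrinsics
    projected = project_to_2d_box(box3d_corners(center, dims, yaw), fx, fy, cx, cy)
    return 1.0 - iou(projected, box2d)
```

Under these assumptions the term is zero when the projected 3D box matches the 2D detection exactly and rises to one when the two do not overlap at all. A trainable version would express the same computation in a differentiable tensor library.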
