Journal of Computer Applications

Monocular Depth Estimation Based on Scene Flow Compensation in Dynamic Scenes

  

  • Received: 2025-11-27 Revised: 2026-04-08 Accepted: 2026-04-13 Online: 2026-04-23 Published: 2026-04-23

动态场景下基于场景流补偿的单目深度估计

ZHANG Ruixin (张瑞欣), YU Hongfei (于红绯)

  1. Liaoning Petrochemical University (辽宁石油化工大学)
  • Corresponding author: YU Hongfei (于红绯)

Abstract: In self-supervised monocular depth estimation, independently moving objects in dynamic scenes are easily confused with camera motion, which degrades depth accuracy. To address this problem, this paper proposes a depth estimation method based on scene flow compensation. The method extracts 3D scene flow and moving-object masks to decouple the independent motion of objects, then introduces this motion as a prior in the construction of a Compensated Cost Volume, which dynamically compensates pixel matching and suppresses interference from moving objects. In terms of architecture, the model adopts a high-resolution encoder to preserve fine detail and a decoder augmented with channel attention. On the KITTI dataset, the model achieves an absolute relative error (AbsRel) of 0.098 and a threshold accuracy (δ1) of 0.889. On the NuScenes dataset, which contains complex dynamic objects, it achieves an AbsRel of 0.149 and a δ1 of 0.806. Visualization results show correct depth estimates for dynamic objects.
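As a reading aid for the two reported metrics, the following is a minimal sketch of how AbsRel and δ1 are conventionally computed in the depth estimation literature. It is not the authors' evaluation code. The function name `depth_metrics`, the sample values, and the threshold of 1.25 for δ1 are assumptions; the abstract does not state the threshold, and published protocols usually add steps such as depth capping or median scaling that are omitted here.

```python
def depth_metrics(pred, gt, threshold=1.25):
    """Return (abs_rel, delta1) over pixels with valid depth.

    pred, gt: flat sequences of predicted and ground-truth depths.
    threshold: assumed to be 1.25, the conventional value for delta1.
    """
    # Keep only pixels where both depths are positive (0 marks missing data here).
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")

    # AbsRel: mean of |pred - gt| / gt over valid pixels (lower is better).
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)

    # delta1: fraction of pixels with max(pred/gt, gt/pred) < threshold
    # (higher is better).
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    delta1 = hits / len(pairs)
    return abs_rel, delta1


# Hypothetical values for illustration only; the last pixel has no ground truth.
pred = [10.0, 22.0, 30.0, 8.0]
gt = [10.0, 25.0, 20.0, 0.0]
abs_rel, delta1 = depth_metrics(pred, gt)
print(f"AbsRel={abs_rel:.3f}, delta1={delta1:.3f}")
```

Under this convention, the reported δ1 of 0.889 on KITTI would mean that about 88.9% of evaluated pixels have a predicted depth within a factor of 1.25 of the ground truth.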

Key words: Self-supervised, Depth estimation, Scene flow, Moving objects, Cost volume

摘要: 针对自监督单目深度估计中,动态场景下自主运动物体与相机运动混淆导致深度估计精度下降的问题,本文提出基于场景流补偿的深度估计方法,通过提取三维场景流和运动物体掩膜解耦物体独立运动,将其作为运动先验引入成本体积(Compensated Cost Volume)构建以动态补偿像素匹配,抑制运动目标干扰。在模型结构方面,所提出模型采用高分辨率编码器以保留细节信息,并设计通道注意力增强解码器。实验结果表明,该模型在 KITTI 数据集上绝对相对误差(AbsRel)指标达到了 0.098,阈值准确率(δ1)指标达到了 0.889,在包含复杂动态物体的 NuScenes 数据集上 AbsRel 指标达到了 0.149,δ1 指标达到了 0.806,可视化结果展示了对动态物体的正确深度估计。

关键词: 自监督, 深度估计, 场景流, 运动物体, 成本体积
