Journal of Computer Applications ›› 2019, Vol. 39 ›› Issue (1): 267-274. DOI: 10.11772/j.issn.1001-9081.2018061305

• Virtual reality and multimedia computing •

Image depth estimation model based on atrous convolutional neural network

LIAO Bin (廖斌), LI Haowen (李浩文)

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan Hubei 430062, China
  • Received: 2018-06-22 Revised: 2018-08-09 Online: 2019-01-10 Published: 2019-01-21
  • Corresponding author: LIAO Bin
  • About the authors: LIAO Bin (born 1979), male, from Xiangyang, Hubei; professor, Ph.D.; main research interest: image and video processing. LI Haowen (born 1993), male, from Luoyang, Henan; M.S. candidate; main research interest: image and video processing.
  • Supported by:
    This work is partially supported by the National Natural Science Foundation of China (61300125).

Abstract: To address the poor results and inaccurate depth values of single-image depth estimation under traditional machine learning methods, a depth estimation model based on an Atrous Convolutional Neural Network (ACNN) is proposed. First, a Convolutional Neural Network (CNN) extracts feature maps from the original image layer by layer. Second, an atrous convolution structure fuses the spatial information of the original image with the extracted low-level image features to produce an initial depth map. Finally, the initial depth map is fed into a Conditional Random Field (CRF), which refines it under three joint constraints (pixel spatial position, grayscale, and gradient information) to obtain the final depth map. Model usability verification and error estimation were carried out on objective datasets. The experimental results show that the proposed algorithm achieves lower error and higher accuracy: the Root Mean Square Error (RMSE) is reduced by 30.86% on average compared with machine learning based algorithms, and the accuracy is improved by 14.5% compared with deep learning based algorithms. The gains in both the error figures and the visual quality indicate that the model obtains better results in image depth estimation.

Key words: atrous convolution, Convolutional Neural Network (CNN), Conditional Random Field (CRF), depth estimation, deep learning
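
The atrous (dilated) convolution named in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the function name `atrous_conv2d`, the nested-list image representation, the `rate` parameter, and the choice of valid (no padding) borders are all assumptions made for illustration. It only shows how spacing the kernel taps `rate` pixels apart enlarges the receptive field without adding weights.

```python
def atrous_conv2d(image, kernel, rate):
    """Valid-mode 2D atrous (dilated) cross-correlation on nested lists.

    Hypothetical helper for illustration only; `rate` is the spacing
    between neighbouring kernel taps (rate=1 is an ordinary convolution).
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    kh, kw = len(kernel), len(kernel[0])
    # Footprint of the kernel once (rate - 1) gaps sit between its taps.
    eff_h = (kh - 1) * rate + 1
    eff_w = (kw - 1) * rate + 1
    out_h = len(image) - eff_h + 1
    out_w = len(image[0]) - eff_w + 1
    if out_h < 1 or out_w < 1:
        raise ValueError("image is smaller than the dilated kernel")
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    # Sample the input every `rate` pixels instead of every pixel.
                    acc += kernel[u][v] * image[i + u * rate][j + v * rate]
            row.append(acc)
        out.append(row)
    return out


# A 5x5 ramp image and a 3x3 box kernel (both made up for the demo).
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

# With rate=2 the nine taps cover the full 5x5 window.
print(atrous_conv2d(image, kernel, 2))  # [[108.0]]
```

With `rate=1` the same nine weights cover only a 3x3 window; with `rate=2` they cover 5x5, which is the property that lets an atrous structure bring in wider spatial context at the same parameter cost.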

CLC number: