Journal of Computer Applications ›› 2021, Vol. 41 ›› Issue (1): 215-219. DOI: 10.11772/j.issn.1001-9081.2020060969

Special topic: The 8th China Conference on Data Mining (CCDM 2020)

Monocular image depth estimation method based on ResNeXt with squeeze-and-excitation module

WEN Jing, LI Zhihong

  1. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China
  • Received: 2020-05-31 Revised: 2020-07-21 Published: 2021-01-10 Online: 2020-09-02
  • Corresponding author: WEN Jing
  • About the authors: WEN Jing (born 1982), female, from Taiyuan, Shanxi; associate professor, Ph.D., CCF member; research interests: computer vision, image processing, and pattern recognition. LI Zhihong (born 1995), male, from Xinzhou, Shanxi; M.S. candidate; research interests: computer vision and machine learning.
  • Supported by:
    This work is partially supported by the Shanxi Applied Basic Research Program (201701D121053).

Abstract: Existing monocular image depth estimation methods lack a representation of the global information relationships between feature channels. To address this, a monocular image depth estimation method based on SE-ResNeXt (Squeeze-and-Excitation ResNeXt) is proposed. First, the dynamic, non-linear relationships between feature channels are modeled to improve the global information representation ability of the network. Then, a feature re-calibration strategy adaptively re-calibrates the responses of the feature channels, further improving feature utilization. Finally, the ResNeXt structure further improves performance without increasing model complexity. Experimental results show that, compared with the algorithm without the ResNeXt structure, the proposed method achieves lower error: its Root Mean Squared Error (RMSE) is 10% lower and its Absolute Relative error (AbsRel) is 27% lower.

Key words: monocular image depth estimation, information aggregation, global information, feature re-calibration, feature response
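The channel re-calibration described in the abstract can be illustrated with a minimal sketch of a generic squeeze-and-excitation step. This is not the authors' network: it assumes the commonly used formulation (global average pooling, two fully connected layers with ReLU and sigmoid, then channel-wise scaling), omits biases, and uses hypothetical names (`se_recalibrate`, `w1`, `w2`) and plain nested lists in place of tensors.

```python
import math


def se_recalibrate(feature_map, w1, w2):
    """Re-weight channels of a feature map with a generic SE step.

    feature_map: list of C channels, each an H x W nested list of floats.
    w1: hidden x C weight matrix (hypothetical reduction layer, no bias).
    w2: C x hidden weight matrix (hypothetical expansion layer, no bias).
    Returns (scaled_feature_map, gates).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, part 1: fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, part 2: fully connected layer followed by sigmoid,
    # producing one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Re-calibration: scale every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates
```

Because the gates depend on the pooled statistics of every channel, each channel's scaling reflects information from the whole feature map, which is one way to read the "global information" claim in the abstract.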

CLC number:
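The two error metrics reported in the abstract are not defined on this page. The sketch below assumes the definitions commonly used in depth estimation work, computed over pixels with valid (positive) ground-truth depth, and reads "10% lower" as a relative reduction against the baseline. The function names are hypothetical.

```python
import math


def rmse(pred, gt):
    """Root mean squared error over paired depth values."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))


def abs_rel(pred, gt):
    """Mean absolute error relative to ground truth (gt values must be > 0)."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline
```

Under this reading, a baseline RMSE of 1.0 and a new RMSE of 0.9 would correspond to the 10% reduction quoted in the abstract; the actual error values are not given here.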