Monocular image depth estimation method based on ResNeXt with squeeze-and-excitation module

Journal of Computer Applications, 2021, Vol. 41, Issue (1): 215-219. DOI: 10.11772/j.issn.1001-9081.2020060969

Special topic: The 8th China Conference on Data Mining (CCDM 2020)

WEN Jing, LI Zhihong

School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006, China

Received: 2020-05-31 Revised: 2020-07-21 Issue published: 2021-01-10 Released online: 2020-09-02
Corresponding author: WEN Jing
About the authors: WEN Jing (born 1982), female, from Taiyuan, Shanxi; Ph.D., associate professor, CCF member. Her research interests include computer vision, image processing, and pattern recognition. LI Zhihong (born 1995), male, from Xinzhou, Shanxi; M.S. candidate. His research interests include computer vision and machine learning.
Supported by:
This work is partially supported by the Shanxi Applied Basic Research Program (201701D121053).

Abstract

Abstract: Existing monocular image depth estimation methods lack a representation of the global information relationships between feature channels. To address this, a monocular image depth estimation method based on SE-ResNeXt (Squeeze-and-Excitation ResNeXt) is proposed. First, the dynamic, non-linear relationships between feature channels are modeled to improve the network's ability to represent global information. Then, a feature re-calibration strategy adaptively re-calibrates the response of each feature channel, further improving feature utilization. Finally, the ResNeXt structure further improves performance without increasing model complexity. Experimental results show that, compared with the algorithm without the ResNeXt structure, the proposed method achieves lower error: the Root Mean Squared Error (RMSE) is reduced by 10% and the Absolute Relative error (AbsRel) by 27%.

Keywords: monocular image depth estimation, information aggregation, global information, feature re-calibration, feature response
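
The channel re-calibration step described in the abstract can be illustrated with a minimal NumPy sketch of a generic squeeze-and-excitation block. This is not the authors' implementation: the function name `se_recalibrate`, the reduction ratio `r`, the random weights, and the omission of biases are all assumptions made for illustration.

```python
import numpy as np

def se_recalibrate(features, w1, w2):
    """Apply squeeze-and-excitation to a (C, H, W) feature map.

    w1: (C // r, C) bottleneck weights (hypothetical, biases omitted)
    w2: (C, C // r) expansion weights (hypothetical, biases omitted)
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    z = features.mean(axis=(1, 2))                 # shape (C,)
    # Excitation: bottleneck layer with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(w1 @ z, 0.0)               # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # shape (C,)
    # Re-calibration: scale every channel by its own gate.
    return features * gates[:, None, None], gates

# Toy input: 8 channels, 4x4 spatial size, reduction ratio 4 (all assumed).
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

y, gates = se_recalibrate(x, w1, w2)
print(y.shape, gates.round(3))
```

The gates depend on the pooled statistics of all channels at once, which is how such a block lets global information modulate each channel's response. In a trained network the two weight matrices would be learned rather than sampled at random.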

CLC number: TP391.4
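
The abstract reports results as RMSE and AbsRel. The sketch below shows how these two metrics are commonly defined in the depth estimation literature; it is assumed, not confirmed by this page, that the paper uses the same definitions. The depth values are hypothetical and are not data from the paper.

```python
import math

def rmse(pred, gt):
    """Root mean squared error between predicted and ground-truth depths."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt))

def abs_rel(pred, gt):
    """Mean absolute error relative to the ground-truth depth (gt must be > 0)."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)

# Hypothetical depths in metres, for illustration only.
gt = [2.0, 4.0, 10.0, 20.0]
pred = [2.5, 3.0, 11.0, 18.0]

print(rmse(pred, gt), abs_rel(pred, gt))
```

For these made-up values the RMSE is 1.25 and the AbsRel is approximately 0.175. RMSE is dominated by large absolute errors, which tend to occur at far range, while AbsRel normalizes each error by the true depth, so the two metrics can improve by different amounts for the same model.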

Cite this article: WEN Jing, LI Zhihong. Monocular image depth estimation method based on ResNeXt with squeeze-and-excitation module[J]. Journal of Computer Applications, 2021, 41(1): 215-219.


