Journal of Computer Applications, 2023, Vol. 43, Issue (10): 3230-3235. DOI: 10.11772/j.issn.1001-9081.2022091398
• Multimedia computing and computer simulation •
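Received: 2022-09-19
Revised: 2023-02-04
Accepted: 2023-02-08
Online: 2023-03-07
Published: 2023-10-10
Corresponding author: Yuanwei BI
About the author: LI Chuanbiao, born in 1997 in Jinan, Shandong, male, M. S. candidate. His research interests include binocular stereo matching and three-dimensional reconstruction.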
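Abstract:
Although Convolutional Neural Networks (CNNs) have made good progress on supervised stereo matching, most CNN-based algorithms perform poorly across data domains. To address cross-domain stereo matching, a CNN-based Cross-domain Adaptive Stereo Matching (CASM-Net) algorithm is proposed, which uses transfer learning to achieve domain-adaptive stereo matching. The algorithm uses a transferable feature extraction module to extract rich wide-domain features for the cross-domain stereo matching task. An adaptive cost optimization module is designed to optimize the cost by adaptively using similarity information from different receptive fields, which yields the optimal cost distribution. In addition, a disparity score prediction module is proposed to quantify the stereo matching ability of different regions, and the disparity result is further optimized by adjusting the disparity search range of the image. Experimental results show that, on the KITTI2012 and KITTI2015 datasets, the 2-PE-Noc, 2-PE-All and 3-PE-fg of CASM-Net are 6.1%, 3.3% and 19.3% lower, respectively, than those of the Pyramid Stereo Matching Network (PSMNet) algorithm. On the Middlebury dataset, without retraining, CASM-Net achieves the best or second-best 2-PE result on every sample among the compared algorithms. These results indicate that CASM-Net improves cross-domain stereo matching.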
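The 2-PE and 3-PE figures quoted in the abstract are N-pixel error rates. The following is a minimal sketch of how such a metric is commonly computed, assuming N-PE is the percentage of valid pixels whose absolute disparity error exceeds N pixels. The paper's exact definition is not given on this page, and the function name `n_pixel_error` and the convention that a ground-truth value of 0 marks a missing measurement are illustrative assumptions.

```python
from typing import Sequence


def n_pixel_error(pred: Sequence[float], gt: Sequence[float], n: float = 3.0) -> float:
    """Return the percentage of valid pixels with |pred - gt| > n.

    Assumption (not from the paper): gt <= 0 marks a pixel without ground truth.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    # Keep only pixels that have a ground-truth disparity.
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth pixels")
    # Count pixels whose absolute error is larger than the threshold.
    bad = sum(1 for p, g in valid if abs(p - g) > n)
    return 100.0 * bad / len(valid)


# Toy flattened disparity maps; the third pixel has no ground truth.
pred = [10.0, 12.5, 30.0, 7.0, 20.0]
gt = [10.5, 16.0, 0.0, 7.0, 22.5]
print(n_pixel_error(pred, gt, n=3.0))  # 25.0 (1 of 4 valid pixels is off by more than 3)
print(n_pixel_error(pred, gt, n=2.0))  # 50.0 (2 of 4 valid pixels are off by more than 2)
```

In KITTI terminology, the Noc and All suffixes restrict the evaluation to non-occluded pixels or to all pixels, and bg and fg to background or foreground regions. In this sketch that would amount to changing which pixels count as valid.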
Chuanbiao LI, Yuanwei BI. Stereo matching algorithm based on cross-domain adaptation[J]. Journal of Computer Applications, 2023, 43(10): 3230-3235.
Experiment | Setting | 3-PE on KITTI/% | 3-PE on Middlebury/% | 3-PE on ETH3D/% | Inference time on KITTI/s
---|---|---|---|---|---
Feature extraction | Original ResNet | 4.6 | 22.93 | 3.53 | 0.270
Feature extraction | Transferred ResNet | 3.9 | 22.65 | 3.47 | 0.270
Cost optimization | Single-scale cost optimization | 5.3 | 22.63 | 3.56 | 0.221
Cost optimization | Multi-scale cost optimization | 3.5 | 22.01 | 3.24 | 0.230
Disparity score prediction | Without disparity score prediction | 4.7 | 23.96 | 3.56 | 0.225
Disparity score prediction | With disparity score prediction | 3.4 | 22.83 | 3.15 | 0.228
Loss function | Smooth L1 loss | 4.6 | 23.86 | 3.93 | 0.225
Loss function | Smooth L1 loss + MAE loss | 4.3 | 23.53 | 3.75 | 0.225
Tab. 1 Experimental results of different network settings on multiple datasets
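The last two rows of Table 1 compare a Smooth L1 loss with a Smooth L1 + MAE combination. The sketch below shows one way such a combined loss could be computed. The threshold `beta = 1.0`, the weight `mae_weight`, and the plain weighted sum are assumptions for illustration, since this page does not state how the paper combines the two terms.

```python
from typing import Sequence


def smooth_l1(diff: float, beta: float = 1.0) -> float:
    """Smooth L1 penalty: quadratic below beta, linear above it."""
    a = abs(diff)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


def combined_loss(pred: Sequence[float], gt: Sequence[float],
                  mae_weight: float = 1.0) -> float:
    """Mean Smooth L1 plus weighted mean absolute error (MAE).

    mae_weight is a hypothetical parameter, not a value from the paper.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and equally long")
    diffs = [p - g for p, g in zip(pred, gt)]
    sl1 = sum(smooth_l1(d) for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return sl1 + mae_weight * mae


# Two toy disparities with errors of 0.5 and 2.0 pixels.
print(combined_loss([1.0, 4.0], [1.5, 2.0]))  # 0.8125 + 1.25 = 2.0625
```

The MAE term keeps a constant gradient for small errors, where Smooth L1 alone becomes quadratic; whether that is the authors' motivation is not stated here.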
Algorithm | KITTI2012 2-PE-Noc/% | KITTI2012 2-PE-All/% | KITTI2012 3-PE-Noc/% | KITTI2012 3-PE-All/% | KITTI2015 3-PE-bg/% | KITTI2015 3-PE-fg/% | KITTI2015 3-PE-All/% | Time/s
---|---|---|---|---|---|---|---|---
SGM | 8.66 | 10.16 | 5.76 | 7.00 | 5.06 | 13.00 | 6.38 | |
PSMNet | | | 1.49 | 1.89 | | 4.62 | 2.32 | 0.41
SegStereo | 2.66 | 3.19 | 1.68 | 2.03 | 1.88 | | | 0.60
PBCP | 3.62 | 5.01 | 2.36 | 3.45 | 2.58 | 8.74 | 3.61 | 68.00
CRD-Fusion | 6.27 | 7.53 | 4.38 | 5.40 | 4.59 | 13.68 | 6.11 | 0.02
iResNet | 2.69 | 3.34 | 1.71 | 2.16 | / | / | / | 0.12
CASM-Net | 2.29 | 2.91 | | | 1.85 | 3.73 | 2.16 | 0.50
Tab. 2 Experimental results of different methods on KITTI datasets
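The relative reductions quoted in the abstract can be checked against Table 2 wherever both values survive. As a small worked example, the KITTI2015 3-PE-fg entries for PSMNet (4.62) and CASM-Net (3.73) reproduce the reported 19.3%. The helper name `relative_reduction` is an illustrative assumption.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# 3-PE-fg on KITTI2015, values taken from Table 2.
psmnet_fg = 4.62
casm_net_fg = 3.73
print(round(relative_reduction(psmnet_fg, casm_net_fg), 1))  # 19.3
```

Note that this is a relative reduction of the error rate, not a difference in percentage points: the absolute gap between the two entries is 0.89 points.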
Algorithm | Adirondack | ArtL | Motorcycle | Piano | Pipes | Recycle | Teddy
---|---|---|---|---|---|---|---
SGM | 14.90 | 15.00 | 14.30 | 22.70 | 15.60 | 8.00 | |
PSMNet | 62.30 | 53.40 | 60.40 | 54.10 | 52.60 | 54.50 | 34.10
iResNet | 9.47 | 17.90 | 19.20 | 20.30 | | | |
CASM-Net | 11.60 | 15.80 | 12.70 | 9.53 | | | |
Tab. 3 2-PE results of different algorithms on Middlebury dataset (%)