Journal of Computer Applications, 2023, Vol. 43, Issue (10): 3230-3235. DOI: 10.11772/j.issn.1001-9081.2022091398
Special topic: Multimedia computing and computer simulation
Received: 2022-09-19
Revised: 2023-02-04
Accepted: 2023-02-08
Online: 2023-03-07
Published: 2023-10-10
Corresponding author: Yuanwei BI
About the author: LI Chuanbiao, born in 1997 in Jinan, Shandong, M. S. candidate. His research interests include binocular stereo matching and three-dimensional reconstruction.
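Abstract:
Although Convolutional Neural Networks (CNNs) have made good progress in supervised stereo matching, most CNN-based algorithms perform poorly across domains. To address stereo matching across data domains, a CNN-based cross-domain adaptive stereo matching (CASM-Net) algorithm is proposed, which uses transfer learning to achieve domain-adaptive stereo matching. The proposed algorithm uses a transferable feature extraction module to extract rich wide-domain features for cross-domain stereo matching. An adaptive cost optimization module is designed to optimize the matching cost by adaptively using similarity information from different receptive fields, yielding the optimal cost distribution. In addition, a disparity score prediction module is proposed to quantify the stereo matching ability in different regions and to further refine the disparity results by adjusting the disparity search range of the image. Experimental results show that, on the KITTI2012 and KITTI2015 datasets, the 2-PE-Noc, 2-PE-All and 3-PE-fg of CASM-Net are 6.1%, 3.3% and 19.3% lower, respectively, than those of the Pyramid Stereo Matching Network (PSMNet) algorithm. On the Middlebury dataset, without retraining, CASM-Net achieves the best or second-best 2-PE result on every sample among the compared algorithms. These results indicate that CASM-Net improves cross-domain stereo matching.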
Cite this article:
Chuanbiao LI, Yuanwei BI. Stereo matching algorithm based on cross-domain adaptation[J]. Journal of Computer Applications, 2023, 43(10): 3230-3235.
Experiment | Setting | 3-PE on KITTI/% | 3-PE on Middlebury/% | 3-PE on ETH3D/% | Inference time on KITTI/s
---|---|---|---|---|---
Feature extraction | Original ResNet | 4.6 | 22.93 | 3.53 | 0.270
Feature extraction | Transferred ResNet | 3.9 | 22.65 | 3.47 | 0.270
Cost optimization | Single-scale cost optimization | 5.3 | 22.63 | 3.56 | 0.221
Cost optimization | Multi-scale cost optimization | 3.5 | 22.01 | 3.24 | 0.230
Disparity score prediction | Without disparity score prediction | 4.7 | 23.96 | 3.56 | 0.225
Disparity score prediction | With disparity score prediction | 3.4 | 22.83 | 3.15 | 0.228
Loss function | Smooth L1 loss | 4.6 | 23.86 | 3.93 | 0.225
Loss function | Smooth L1 loss + MAE loss | 4.3 | 23.53 | 3.75 | 0.225
Tab. 1 Experimental results of different network settings on multiple datasets
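The N-PE columns report an N-pixel error rate. As a minimal sketch, assuming N-PE is the percentage of valid pixels whose absolute disparity error exceeds N pixels (the page itself does not define the metric, and benchmarks such as KITTI 2015 add further conditions), it could be computed as follows. The function name `n_pixel_error`, the use of `None` to mark pixels without ground truth, and the toy disparity values are all hypothetical and not taken from the paper.

```python
def n_pixel_error(pred, gt, threshold=3.0):
    """Return the percentage of valid pixels with |pred - gt| > threshold.

    pred and gt are 2D lists of disparities with the same shape.
    Assumption: a ground-truth value of None marks a pixel without ground truth.
    """
    bad = 0
    valid = 0
    for pred_row, gt_row in zip(pred, gt):
        for d_pred, d_gt in zip(pred_row, gt_row):
            if d_gt is None:
                continue  # skip pixels that have no ground-truth disparity
            valid += 1
            if abs(d_pred - d_gt) > threshold:
                bad += 1
    if valid == 0:
        raise ValueError("no valid ground-truth pixels")
    return 100.0 * bad / valid


# Toy 2x3 disparity maps (hypothetical values for illustration only).
pred = [[10.0, 20.5, 31.0], [5.0, 7.0, 9.0]]
gt = [[10.5, 24.0, 33.5], [None, 7.5, 13.5]]

error_3pe = n_pixel_error(pred, gt, threshold=3.0)  # errors 3.5 and 4.5 exceed 3 px
error_2pe = n_pixel_error(pred, gt, threshold=2.0)  # error 2.5 also exceeds 2 px
print(error_3pe, error_2pe)
```

Lowering the threshold from 3 to 2 pixels can only keep or increase the error rate, which is why 2-PE values are never lower than the corresponding 3-PE values for the same prediction.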
Algorithm | KITTI2012 2-PE-Noc/% | KITTI2012 2-PE-All/% | KITTI2012 3-PE-Noc/% | KITTI2012 3-PE-All/% | KITTI2015 3-PE-bg/% | KITTI2015 3-PE-fg/% | KITTI2015 3-PE-All/% | Time/s
---|---|---|---|---|---|---|---|---
SGM | 8.66 | 10.16 | 5.76 | 7.00 | 5.06 | 13.00 | 6.38 | 
PSMNet |  |  | 1.49 | 1.89 |  | 4.62 | 2.32 | 0.41
SegStereo | 2.66 | 3.19 | 1.68 | 2.03 | 1.88 |  |  | 0.60
PBCP | 3.62 | 5.01 | 2.36 | 3.45 | 2.58 | 8.74 | 3.61 | 68.00
CRD-Fusion | 6.27 | 7.53 | 4.38 | 5.40 | 4.59 | 13.68 | 6.11 | 0.02
iResNet | 2.69 | 3.34 | 1.71 | 2.16 | / | / | / | 0.12
CASM-Net | 2.29 | 2.91 |  |  | 1.85 | 3.73 | 2.16 | 0.50
Tab. 2 Experimental results of different methods on KITTI datasets
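The abstract states the improvements over PSMNet as relative reductions. As a small worked check, assuming that 4.62 and 3.73 are the KITTI2015 3-PE-fg values of PSMNet and CASM-Net respectively, the reported 19.3% can be reproduced as shown here. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, proposed):
    """Return the relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


# Assumption: these are the 3-PE-fg entries for PSMNet and CASM-Net in Tab. 2.
psmnet_fg = 4.62
casm_fg = 3.73

reduction_fg = relative_reduction(psmnet_fg, casm_fg)
print(f"{reduction_fg:.1f}%")
```

The same formula applies to the 2-PE-Noc and 2-PE-All figures quoted in the abstract, given the corresponding PSMNet values.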
Algorithm | Adirondack | ArtL | Motorcycle | Piano | Pipes | Recycle | Teddy
---|---|---|---|---|---|---|---
SGM | 14.90 | 15.00 | 14.30 | 22.70 | 15.60 | 8.00 | |
PSMNet | 62.30 | 53.40 | 60.40 | 54.10 | 52.60 | 54.50 | 34.10 |
iResNet | 9.47 | 17.90 | 19.20 | 20.30 | |||
CASM-Net | 11.60 | 15.80 | 12.70 | 9.53 |
Tab. 3 2-PE results of different algorithms on Middlebury dataset (%)