Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (11): 3698-3706.DOI: 10.11772/j.issn.1001-9081.2024111599
• Multimedia computing and computer simulation •
Changjiang JIANG1,2, Jie XIANG1,2, Xuying HE1,2
Received:2024-11-11
Revised:2024-12-30
Accepted:2025-01-07
Online:2025-01-14
Published:2025-11-10
Contact: Changjiang JIANG
About author: XIANG Jie, born in 1997 in Chongqing, M. S. candidate. His research interests include computer vision and object detection.
Changjiang JIANG, Jie XIANG, Xuying HE. Binocular vision object localization algorithm for robot arm grasping[J]. Journal of Computer Applications, 2025, 45(11): 3698-3706.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024111599
| Model | FAT dataset AP | FAT dataset R | Real dataset AP | Real dataset R |
|---|---|---|---|---|
| w/o all | 79.5 | 81.4 | 48.6 | 54.1 |
| w/o cross | 83.5 | 84.8 | 85.6 | 87.8 |
| w/o real | 86.2 | 87.4 | 46.4 | 52.9 |
| w/o direct | 84.4 | 85.7 | 97.7 | 98.3 |
| w/o recon | 83.2 | 86.9 | 96.2 | 98.2 |
| full | 85.9 | 87.2 | 98.1 | 98.7 |
Tab. 1 Ablation results of the object detection unit (unit: %)
| Model | EPE | D1-all | Abs Rel | Sq Rel | RMSE | RMSE log |
|---|---|---|---|---|---|---|
| w/o all | 2.9068 | 19.30 | 0.1293 | 0.4761 | 0.4927 | 0.1678 |
| w/o cross | 2.9719 | 16.72 | 0.1382 | 0.5108 | 0.4713 | 0.1701 |
| w/o real | 0.5031 | 2.89 | 0.0145 | 0.0069 | 0.0706 | 0.0317 |
| w/o direct | 0.5921 | 3.47 | 0.0173 | 0.0151 | 0.0883 | 0.0329 |
| w/o recon | 0.5538 | 3.11 | 0.0160 | 0.0077 | 0.0758 | 0.0339 |
| full | 0.4992 | 2.89 | 0.0143 | 0.0068 | 0.0699 | 0.0315 |
Tab. 2 Ablation results of stereo depth estimation
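The metric definitions behind the column headers are not reproduced on this page. As a rough guide to reading the table, the sketch below assumes the conventions commonly used in stereo and depth benchmarks: EPE as the mean absolute disparity error, D1-all as the percentage of pixels whose disparity error exceeds both 3 px and 5% of the true disparity, and the usual Abs Rel, Sq Rel, RMSE and RMSE log depth errors. The function names `disparity_metrics` and `depth_metrics` are hypothetical, and the paper may use different thresholds or masks.

```python
import math


def disparity_metrics(pred, gt):
    """Return (EPE in pixels, D1-all in %) for flat lists of valid disparities."""
    errors = [abs(p - g) for p, g in zip(pred, gt)]
    epe = sum(errors) / len(errors)
    # Assumed KITTI-style outlier rule: error > 3 px and > 5% of true disparity.
    outliers = sum(1 for e, g in zip(errors, gt) if e > 3.0 and e > 0.05 * g)
    return epe, 100.0 * outliers / len(errors)


def depth_metrics(pred, gt):
    """Return (Abs Rel, Sq Rel, RMSE, RMSE log) for flat lists of positive depths."""
    n = len(gt)
    pairs = list(zip(pred, gt))
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    sq_rel = sum((p - g) ** 2 / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    rmse_log = math.sqrt(sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n)
    return abs_rel, sq_rel, rmse, rmse_log


# Toy example: four pixels, one of which is off by 4 px.
epe, d1_all = disparity_metrics([10.0, 20.0, 34.0, 40.0], [10.0, 21.0, 30.0, 40.0])
print(epe, d1_all)  # 1.25 25.0
```

Under these assumed definitions, lower is better for every column, which matches how the "full" row compares with the ablated variants.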
| Dataset | Model | Abs/mm | Rel/% | Valid/% |
|---|---|---|---|---|
| FAT | w/o all | 23.38 | 2.39 | 81.56 |
| FAT | w/o cross | 22.56 | 2.11 | 82.38 |
| FAT | w/o real | 8.76 | 0.80 | 93.86 |
| FAT | w/o direct | 11.31 | 1.08 | 90.98 |
| FAT | w/o recon | 9.58 | 0.87 | 93.06 |
| FAT | full | 8.54 | 0.77 | 93.73 |
| Real | w/o all | 32.98 | 3.65 | 60.75 |
| Real | w/o cross | 26.84 | 3.22 | 81.90 |
| Real | w/o real | 24.12 | 2.98 | 63.77 |
| Real | w/o direct | 17.14 | 2.49 | 86.82 |
| Real | w/o recon | 15.68 | 2.07 | 91.49 |
| Real | full | 14.55 | 2.05 | 94.12 |
Tab. 3 Ablation results of object localization
| Network | FAT dataset AP | FAT dataset R | Real dataset AP | Real dataset R |
|---|---|---|---|---|
| Stereo R-CNN | 70.6 | 75.2 | 74.3 | 81.9 |
| DSGN | 78.6 | 81.6 | 83.8 | 86.0 |
| YOLOStereo3D | 77.4 | 78.3 | 85.5 | 83.1 |
| CDN | 79.6 | 82.3 | 84.2 | 87.2 |
| Faster R-CNN | 69.1 | 73.0 | 71.5 | 75.4 |
| RetinaNet | 68.8 | 72.8 | 72.7 | 76.5 |
| CenterNet | 72.8 | 74.5 | 73.1 | 76.2 |
| YOLOv5l | 83.1 | 84.7 | 90.7 | 93.0 |
| YOLOv8l | 83.8 | 85.4 | 91.6 | 92.9 |
| BDS-YOLO | 85.9 | 87.2 | 98.1 | 98.7 |
Tab. 4 Experimental results of object detection
| Network | Abs/mm | Rel/% | Valid/% |
|---|---|---|---|
| Stereo R-CNN | 27.61 | 2.79 | 80.72 |
| DSGN | 19.57 | 1.78 | 87.25 |
| YOLOStereo3D | 21.83 | 2.26 | 83.12 |
| CDN | 17.92 | 1.66 | 88.32 |
| YOLOv8+BBox | 34.85 | 3.47 | 71.49 |
| YOLOv8+Tem | 27.78 | 2.71 | 79.14 |
| YOLOv8+SGBM | 8.92 | 0.81 | 87.12 |
| YOLOv8+CREStereo | 10.41 | 1.18 | 93.21 |
| YOLOv8+RAFT | 11.89 | 1.17 | 92.91 |
| BDS-YOLO | 8.54 | 0.77 | 93.73 |
Tab. 5 Experimental results of object localization on the FAT dataset
| Network | Abs/mm | Rel/% | Valid/% | Time/ms |
|---|---|---|---|---|
| Stereo R-CNN | 38.34 | 4.53 | 72.36 | 234 |
| DSGN | 28.47 | 3.94 | 78.93 | 472 |
| YOLOStereo3D | 26.32 | 3.79 | 80.51 | 59 |
| CDN | 22.61 | 2.68 | 85.64 | 218 |
| YOLOv8+BBox | 30.21 | 3.65 | 71.86 | 31 |
| YOLOv8+Tem | 27.44 | 3.28 | 77.23 | 57 |
| YOLOv8+SGBM | 30.87 | 3.56 | 81.93 | 98 |
| YOLOv8+CREStereo | 14.73 | 2.23 | 92.28 | 919 |
| YOLOv8+RAFT | 15.25 | 2.10 | 93.88 | 496 |
| BDS-YOLO | 14.55 | 2.05 | 94.12 | 46 |
Tab. 6 Experimental results of object localization on the real dataset
| Network | EPE | D1-all | Abs Rel | Sq Rel | RMSE | RMSE log |
|---|---|---|---|---|---|---|
| YOLO Stereo | 2.6750 | 13.53 | 0.1062 | 0.2631 | 0.3642 | 0.2834 |
| CDN | 1.8828 | 8.23 | 0.0737 | 0.0948 | 0.2777 | 0.2196 |
| ACVNet | 1.2378 | 7.13 | 0.0325 | 0.0242 | 0.1331 | 0.0659 |
| GMStereo | 1.0006 | 7.49 | 0.0268 | 0.0157 | 0.1324 | 0.0572 |
| CREStereo | 0.5091 | 3.33 | 0.0147 | 0.0086 | 0.0867 | 0.0357 |
| RAFT-Stereo | 0.4883 | 3.36 | 0.0186 | 0.0171 | 0.1112 | 0.0471 |
| BDS-YOLO | 0.4992 | 2.89 | 0.0143 | 0.0068 | 0.0699 | 0.0315 |
Tab. 7 Experimental results of stereo depth estimation