Binocular vision object localization algorithm for robot arm grasping
Changjiang JIANG, Jie XIANG, Xuying HE
Journal of Computer Applications    2025, 45 (11): 3698-3706.   DOI: 10.11772/j.issn.1001-9081.2024111599

Recognizing an object and locating its spatial coordinates with machine vision algorithms is crucial for visual grasping with robotic arms. To address the low localization accuracy and poor efficiency of binocular vision-based object recognition and localization, a binocular vision object localization algorithm for robotic arm grasping based on BDS-YOLO (Binocular Detect and Stereo YOLO) was proposed, which combines object detection with stereo depth estimation. The algorithm used attention mechanisms for cross-view feature interaction to enhance feature representation, which enabled the network to obtain high-quality disparity maps through depth feature matching. After further refinement by a self-attention mechanism, the disparity maps were converted into depth information using the triangulation principle. The BDS-YOLO network adopted multi-task learning to train the object detection and stereo depth estimation networks jointly on both synthetic and real-world data. Because dense depth annotations are difficult to obtain for real data, self-supervised learning was applied to optimize the image reconstruction process from disparities, improving the generalization ability of the BDS-YOLO network in real-world scenarios. Experimental results show that, on a real-world dataset, the BDS-YOLO network achieves an Average Precision (AP) in object detection 6.5 percentage points higher than that of YOLOv8l, outperforms a specialized stereo depth estimation algorithm in disparity prediction and depth conversion, reaches an inference speed of more than 20 frames/s, and surpasses the comparative methods in both object recognition and localization. These results indicate that the BDS-YOLO network can meet the requirements of real-time object detection and localization.
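The disparity-to-depth conversion mentioned in the abstract can be illustrated with the standard triangulation relation for a rectified stereo pair, Z = f · B / d, where d is the disparity in pixels. The following is a minimal sketch and not the authors' implementation: the function name `disparity_to_depth`, the parameters `focal_length_px`, `baseline_m`, and `min_disparity`, and the choice to mark invalid pixels as infinitely far are all assumptions made for illustration.

```python
import numpy as np


def disparity_to_depth(disparity, focal_length_px, baseline_m, min_disparity=1e-6):
    """Convert a disparity map (pixels) to a depth map (metres).

    Assumes a rectified stereo pair, so depth follows Z = f * B / d.
    All names and the invalid-pixel convention are hypothetical.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    # Pixels with (near-)zero disparity have no finite depth; mark them as inf.
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > min_disparity
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth


# Example values (assumed, not taken from the paper):
# focal length 800 px, baseline 0.12 m.
example_disparity = np.array([[64.0, 32.0], [0.0, 16.0]])
example_depth = disparity_to_depth(example_disparity, 800.0, 0.12)
```

Because depth is inversely proportional to disparity, halving the disparity doubles the estimated depth, which is why small disparity errors matter most for distant objects.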
