Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (8): 2546-2555. DOI: 10.11772/j.issn.1001-9081.2022071022

• Multimedia computing and computer simulation •

Review of object pose estimation in RGB images based on deep learning

Yi WANG1,2, Jie XIE1, Jia CHENG1, Liwei DOU2,3

  1. College of Electrical Engineering, North China University of Science and Technology, Tangshan, Hebei 063210, China
  2. Tangshan Technology Innovation Center of Intellectualization of Metal Component Production Line, Tangshan, Hebei 063210, China
  3. Tangshan Hexiang Intelligent Technology Company Limited, Tangshan, Hebei 063000, China
  • Received: 2022-07-13 Revised: 2022-11-04 Accepted: 2022-11-07 Online: 2023-01-15 Published: 2023-08-10
  • Contact: Jie XIE
  • About the authors: WANG Yi, born in 1981, from Tangshan, Hebei, Ph. D., associate professor. His research interests include machine vision perception, image processing, and precision measurement.
    CHENG Jia, born in 1982, from Tangshan, Hebei, M. S., experimentalist. Her research interests include instrumentation testing technology and automation devices.
    DOU Liwei, born in 1983, from Tangshan, Hebei, assistant engineer. His research interests include mechanical design, manufacturing and automation.
  • Supported by:
    Scientific Research Project of Higher Education Institutions of Hebei Province (ZD2022114); Tangshan Science and Technology Program (21130212C)

Abstract:

Six-degree-of-freedom (6DoF) pose estimation is a key technology in computer vision and robotics. It estimates the 6DoF pose of an object, that is, its 3DoF translation and 3DoF rotation, from a given input image, and has become a crucial task in fields such as robotic manipulation, autonomous driving, and augmented reality. Firstly, the concept of 6DoF pose was introduced, together with the problems of traditional methods based on feature point correspondence, template matching, and three-dimensional feature descriptors. Then, the current mainstream deep-learning-based 6DoF pose estimation algorithms were described in detail from several perspectives: feature correspondence-based, pixel voting-based, and regression-based methods, as well as multi-object-instance-oriented, synthetic-data-oriented, and category-level-oriented methods. The datasets and evaluation metrics commonly used in pose estimation were also summarized, and the performance of some algorithms was evaluated experimentally. Finally, the current challenges of pose estimation and the key directions for future research were presented.

Key words: 6-degree-of-freedom pose estimation, pose estimation dataset, pose estimation evaluation method, deep learning, computer vision, industrial robot

CLC Number: