Journal of Computer Applications, official website ›› 2022, Vol. 42 ›› Issue (12): 3715-3722. DOI: 10.11772/j.issn.1001-9081.2021101840

• Artificial Intelligence •


6D pose estimation incorporating attentional features for occluded objects

Kangzhe MA1, Jiatian PI2, Zhoubing XIONG3, Jia LYU1

  1. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
    2. National Center for Applied Mathematics in Chongqing (Chongqing Normal University), Chongqing 401331, China
    3. Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120, China
  • Received:2021-10-28 Revised:2021-12-06 Accepted:2021-12-23 Online:2022-01-04 Published:2022-12-10
  • Contact: Jiatian PI
  • About author: MA Kangzhe, born in 1996 in Changzhi, Shanxi, M. S. candidate. His research interests include computer vision and object pose estimation.
    XIONG Zhoubing, born in 1984 in Chongqing, Ph. D., senior engineer. His research interests include autonomous driving, perceptual recognition, and high-precision mapping and localization.
    LYU Jia, born in 1978 in Dazhou, Sichuan, Ph. D., professor. Her research interests include data mining and machine learning.
  • Supported by:
    Key Project of Chongqing Municipal Education Commission Science and Technology Project (KJZD-K202114802); Youth Project of Chongqing Municipal Education Commission Science and Technology Project (KJQN201800116); Fund of Innovation Research Group of Chongqing Universities (CXQT20015); Graduate Research and Innovation Project in 2021 of Chongqing (CYS21272)


Abstract:

In the process of robotic vision grasping, it is difficult for existing algorithms to estimate the pose of a target object in real time, accurately and robustly under complex background, insufficient illumination, occlusion and other conditions. To address these problems, a keypoint-based 6D object pose estimation network with fused attention features was proposed. Firstly, a Convolutional Block Attention Module (CBAM) was added in the skip-connection stage to focus on spatial and channel information, so that the shallow features of the encoding stage were effectively fused with the deep features of the decoding stage, enhancing the spatial-domain information and the precise positional channel information of the feature maps. Secondly, the attention map of each keypoint was regressed in a weakly supervised way using a normalized loss function, and the attention map was used as the weight score of the keypoint offset at the corresponding pixel position. Finally, the keypoint coordinates were obtained by weighted summation. Experimental results demonstrate that the proposed network reaches 91.3% and 46.3% in the ADD(-S) metric on the LINEMOD and Occlusion LINEMOD datasets respectively, an improvement of 5.0 and 5.5 percentage points over the keypoint-based Pixel-wise Voting Network (PVNet), which verifies that the proposed network is more robust in occlusion scenes.
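The final aggregation step described in the abstract — treating each normalized attention map as per-pixel weight scores for keypoint offset votes and summing them into a coordinate — can be sketched as follows. This is an illustrative NumPy reconstruction under assumptions, not the authors' implementation; the function and variable names (`keypoint_from_attention`, `softmax2d`, `offsets`) are hypothetical, and softmax normalization over pixels is one plausible reading of "normalized":

```python
import numpy as np

def softmax2d(a):
    """Normalize an attention map over all pixels so it sums to 1."""
    e = np.exp(a - a.max())
    return e / e.sum()

def keypoint_from_attention(offsets, attention):
    """Estimate one keypoint as the attention-weighted sum of per-pixel votes.

    offsets:   (H, W, 2) array, each pixel's predicted (dx, dy) toward the keypoint
    attention: (H, W) unnormalized attention map for this keypoint
    Each pixel votes for (x + dx, y + dy); votes are combined with the
    normalized attention values as weight scores.
    """
    h, w, _ = offsets.shape
    ys, xs = np.mgrid[0:h, 0:w]
    votes = np.stack([xs + offsets[..., 0], ys + offsets[..., 1]], axis=-1)  # (H, W, 2)
    wgt = softmax2d(attention)[..., None]                                    # (H, W, 1)
    return (wgt * votes).reshape(-1, 2).sum(axis=0)                          # (x, y)

# Toy check: every pixel's offset points exactly at (3.0, 2.0), so any
# weighting that sums to 1 recovers that keypoint.
H, W = 5, 4
ys, xs = np.mgrid[0:H, 0:W]
true_kp = np.array([3.0, 2.0])
offs = np.stack([true_kp[0] - xs, true_kp[1] - ys], axis=-1).astype(float)
att = np.random.default_rng(0).normal(size=(H, W))
print(np.round(keypoint_from_attention(offs, att), 6))  # → [3. 2.]
```

Because the weights sum to 1, unreliable pixels (e.g. occluded regions, which would receive low attention) are softly suppressed rather than discarded, which is consistent with the robustness claim for occlusion scenes.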

Key words: object 6D pose estimation, attention mechanism, Convolutional Block Attention Module (CBAM), occluded object, keypoint

CLC number: