Journal of Computer Applications, 2022, Vol. 42, Issue (8): 2548-2555. DOI: 10.11772/j.issn.1001-9081.2021050805

• Multimedia computing and computer simulation •

对人体姿态估计热图误差的再思考

杨飞宇1,2, 宋展1, 肖振中2, 莫曜阳2, 陈宇2, 潘哲2, 张敏2, 张遥2, 钱贝贝2, 汤朝伟3, 金武3

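  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055
  2. Orbbec Technology Incorporation Company Limited, Shenzhen, Guangdong 518063
  3. College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016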
  • Received: 2021-05-17 Revised: 2021-11-04 Accepted: 2021-11-04 Online: 2022-08-09 Published: 2022-08-10
  • Corresponding author: YANG Feiyu (杨飞宇)
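  • About the authors: YANG Feiyu (杨飞宇), born in 1990, male, from Shenzhen, Guangdong, Ph. D. His research interests include human pose estimation and image segmentation.
    SONG Zhan (宋展), born in 1980, male, from Shenzhen, Guangdong, Ph. D., professor. His research interests include computer vision.
    XIAO Zhenzhong (肖振中), born in 1980, male, from Shenzhen, Guangdong, Ph. D. His research interests include 3D perception and its applications.
    MO Yaoyang (莫曜阳), born in 1995, male, from Dongguan, Guangdong, M. S. His research interests include human pose estimation and image segmentation.
    CHEN Yu (陈宇), born in 1988, male, from Hengyang, Hunan, M. S. His research interests include human pose estimation.
    PAN Zhe (潘哲), born in 1993, male, from Shenzhen, Guangdong, M. S. His research interests include human pose estimation.
    ZHANG Min (张敏), born in 1994, female, from Tongchuan, Shaanxi, M. S. Her research interests include human pose estimation.
    ZHANG Yao (张遥), born in 1980, male, from Shenzhen, Guangdong, M. S. His research interests include software design.
    QIAN Beibei (钱贝贝), born in 1989, male, from Shenzhen, Guangdong, Ph. D. His research interests include computer vision.
    TANG Chaowei (汤朝伟), born in 1998, male, from Nanjing, Jiangsu, M. S. candidate. His research interests include numerical computation.
    JIN Wu (金武), born in 1992, male, from Nanjing, Jiangsu, Ph. D., associate professor. His research interests include numerical computation.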

Rethinking errors in human pose estimation heatmap

Feiyu YANG1,2, Zhan SONG1, Zhenzhong XIAO2, Yaoyang MO2, Yu CHEN2, Zhe PAN2, Min ZHANG2, Yao ZHANG2, Beibei QIAN2, Chaowei TANG3, Wu JIN3

  1. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen Guangdong 518055, China
  2. Orbbec Technology Incorporation Company Limited, Shenzhen Guangdong 518063, China
  3. College of Energy and Power Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing Jiangsu 210016, China
  • Received: 2021-05-17 Revised: 2021-11-04 Accepted: 2021-11-04 Online: 2022-08-09 Published: 2022-08-10
  • Contact: Feiyu YANG
  • About the authors: YANG Feiyu, born in 1990, Ph. D. His research interests include human pose estimation and image segmentation.
    SONG Zhan, born in 1980, Ph. D., professor. His research interests include computer vision.
    XIAO Zhenzhong, born in 1980, Ph. D. His research interests include 3D perception and its applications.
    MO Yaoyang, born in 1995, M. S. His research interests include human pose estimation and image segmentation.
    CHEN Yu, born in 1988, M. S. His research interests include human pose estimation.
    PAN Zhe, born in 1993, M. S. His research interests include human pose estimation.
    ZHANG Min, born in 1994, M. S. Her research interests include human pose estimation.
    ZHANG Yao, born in 1980, M. S. His research interests include software design.
    QIAN Beibei, born in 1989, Ph. D. His research interests include computer vision.
    TANG Chaowei, born in 1998, M. S. candidate. His research interests include numerical computation.
    JIN Wu, born in 1992, Ph. D., associate professor. His research interests include numerical computation.

Abstract:

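In recent years, heatmap-based algorithms have dominated human pose estimation. Heatmap decoding, that is, converting heatmaps into the coordinates of human joint points, is a basic step of these algorithms. Current heatmap decoding algorithms do not account for the influence of systematic error, so an error-compensation-based heatmap decoding algorithm for human pose estimation is proposed. First, the error compensation factor of the model is estimated during training. Then, in the inference stage, this factor is used to compensate the prediction errors of the human joint points; these errors include both systematic and random error. Extensive experiments across different network architectures, input resolutions, evaluation metrics and datasets show that the proposed algorithm achieves a significant accuracy gain over the current best heatmap decoding algorithm. Specifically, the proposed algorithm improves the Average Precision (AP) of the HRNet-W48-256×192 model on the COCO (Common Objects in Context) dataset by 2.86 percentage points, and improves the Percentage of Correct Keypoints with respect to head (PCKh) of the ResNet-152-256×256 model on the MPII (Max Planck Institute for Informatics) dataset by 7.8 percentage points. In addition, because the proposed algorithm, unlike existing algorithms, needs neither Gaussian smoothing preprocessing nor derivative computation, its speed is about 2 times that of the current best algorithm. The proposed algorithm therefore has practical value for high-accuracy, high-speed human pose estimation.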

Key words: human pose estimation, heatmap, decoding, error compensation, neural network

Abstract:

In recent years, heatmap-based algorithms have been the leading approach to human pose estimation. Heatmap decoding (i.e., transforming heatmaps into the coordinates of human joint points) is a basic step of these algorithms. Existing heatmap decoding algorithms neglect the effect of systematic error. Therefore, an error-compensation-based heatmap decoding algorithm is proposed. First, an error compensation factor of the model is estimated during training. Then, in the inference stage, this factor is used to compensate the prediction errors of the human joint points, which include both systematic and random error. Extensive experiments were carried out on different network architectures, input resolutions, evaluation metrics and datasets. The results show that, compared with the best existing heatmap decoding algorithm, the proposed algorithm achieves a significant accuracy gain. Specifically, with the proposed algorithm, the Average Precision (AP) of the HRNet-W48-256×192 model is improved by 2.86 percentage points on the Common Objects in Context (COCO) dataset, and the Percentage of Correct Keypoints with respect to head (PCKh) of the ResNet-152-256×256 model is improved by 7.8 percentage points on the Max Planck Institute for Informatics (MPII) dataset. In addition, unlike existing algorithms, the proposed algorithm needs neither Gaussian smoothing preprocessing nor derivative computation, so its speed is about 2 times that of the best existing algorithm. The proposed algorithm therefore has practical value for fast and accurate human pose estimation.

Key words: human pose estimation, heatmap, decoding, error compensation, neural network
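
The two-stage idea in the abstract (estimate a compensation factor during training, then apply it at inference) can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: plain argmax decoding is used as the base decoder, and the compensation factor is taken to be a constant per-joint mean residual. The function names `argmax_decode`, `estimate_compensation` and `decode_with_compensation` are hypothetical and do not come from the paper.

```python
import numpy as np


def argmax_decode(heatmaps: np.ndarray) -> np.ndarray:
    """Decode (K, H, W) heatmaps into (K, 2) peak coordinates as (x, y)."""
    num_joints, height, width = heatmaps.shape
    flat_idx = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat_idx, (height, width))
    return np.stack([xs, ys], axis=1).astype(np.float64)


def estimate_compensation(train_heatmaps: np.ndarray,
                          train_coords: np.ndarray) -> np.ndarray:
    """Assumed stand-in for the training-time estimate.

    train_heatmaps: (N, K, H, W) predicted heatmaps on training samples.
    train_coords:   (N, K, 2) ground-truth (x, y) in heatmap pixels.
    Returns the mean residual per joint, shape (K, 2).
    """
    decoded = np.stack([argmax_decode(h) for h in train_heatmaps])
    return (train_coords - decoded).mean(axis=0)


def decode_with_compensation(heatmaps: np.ndarray,
                             compensation: np.ndarray) -> np.ndarray:
    """Inference-time decoding: argmax peak plus the stored correction."""
    return argmax_decode(heatmaps) + compensation
```

Under these assumptions, decoding needs only an argmax and one addition per joint, with no smoothing or derivative step, which is the kind of simplification the abstract credits for the speed gain.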

CLC number: