Journal of Computer Applications, 2022, Vol. 42, Issue (8): 2548-2555. DOI: 10.11772/j.issn.1001-9081.2021050805
• Multimedia computing and computer simulation •
Feiyu YANG1,2, Zhan SONG1, Zhenzhong XIAO2, Yaoyang MO2, Yu CHEN2, Zhe PAN2, Min ZHANG2, Yao ZHANG2, Beibei QIAN2, Chaowei TANG3, Wu JIN3
Received: 2021-05-17
Revised: 2021-11-04
Accepted: 2021-11-04
Online: 2022-08-09
Published: 2022-08-10
Contact: Feiyu YANG
About author: YANG Feiyu, born in 1990, native of Shenzhen, Guangdong, Ph. D. His research interests include human pose estimation and image segmentation.
Abstract:
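In recent years, heatmap-based algorithms have dominated human pose estimation. Heatmap decoding, which converts heatmaps into the coordinates of human joint points, is a basic component of these algorithms. However, current heatmap decoding algorithms do not take the influence of systematic errors into account. Therefore, a heatmap decoding algorithm for human pose estimation based on error compensation was proposed. First, the error compensation factor of the model was estimated during training. Then, in the inference stage, the error compensation factor was used to compensate the prediction errors of human joint points, which include both systematic errors and random errors. Extensive experiments on different network architectures, input resolutions, evaluation metrics and datasets show that the proposed algorithm achieves significant accuracy gains over the current best heatmap decoding algorithm. Specifically, the proposed algorithm improves the Average Precision (AP) of the HRNet-W48-256×192 model on the COCO (Common Objects in Context) dataset by 2.86 percentage points, and improves the Percentage of Correct Keypoints with respect to head (PCKh) of the ResNet-152-256×256 model on the MPII (Max Planck Institute for Informatics) dataset by 7.8 percentage points. In addition, because the proposed algorithm, unlike existing algorithms, requires neither Gaussian smoothing preprocessing nor derivative operations, it is about twice as fast as the current best algorithm. The proposed algorithm therefore has practical value for high-accuracy, high-speed human pose estimation.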
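To make the term "heatmap decoding" concrete, here is a minimal Python sketch of two common baseline decoders. The first takes the location of the maximum response; the second also shifts that location a quarter pixel toward the higher neighbouring response. These are presumably what Tab. 2 lists as the standard and Shifting algorithms, but that mapping is an assumption. The abstract does not give the formula of the error compensation algorithm, so it is not implemented here. The function names, the `shift` parameter and the demo heatmap are illustrative only.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # heatmap[y][x], one joint per heatmap


def decode_argmax(heatmap: Heatmap) -> Tuple[float, float]:
    """Return (x, y) of the maximum response (integer pixel precision)."""
    best_x, best_y, best_v = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, v in enumerate(row):
            if v > best_v:
                best_x, best_y, best_v = x, y, v
    return float(best_x), float(best_y)


def _sign(v: float) -> int:
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def decode_shifting(heatmap: Heatmap, shift: float = 0.25) -> Tuple[float, float]:
    """Argmax plus a fixed sub-pixel shift toward the higher neighbour.

    Assumption: shift = 0.25 is the value commonly used in open-source
    pose estimation code; the source document does not state it.
    """
    x, y = decode_argmax(heatmap)
    xi, yi = int(x), int(y)
    height, width = len(heatmap), len(heatmap[0])
    # Only refine when both neighbours exist along the axis.
    if 0 < xi < width - 1:
        x += shift * _sign(heatmap[yi][xi + 1] - heatmap[yi][xi - 1])
    if 0 < yi < height - 1:
        y += shift * _sign(heatmap[yi + 1][xi] - heatmap[yi - 1][xi])
    return x, y


# Hypothetical 4x4 heatmap with its peak at (x=1, y=1).
demo = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.6, 0.0],
    [0.0, 0.3, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(decode_argmax(demo))
print(decode_shifting(demo))
```

In both decoders the result is still in heatmap coordinates and would have to be scaled back to the input image resolution.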
Feiyu YANG, Zhan SONG, Zhenzhong XIAO, Yaoyang MO, Yu CHEN, Zhe PAN, Min ZHANG, Yao ZHANG, Beibei QIAN, Chaowei TANG, Wu JIN. Rethinking errors in human pose estimation heatmap[J]. Journal of Computer Applications, 2022, 42(8): 2548-2555.
Model | Input resolution | Δopt |
---|---|---|
ResNet-50 | 256×192 | 4 |
ResNet-50 | 256×256 | 4 |
ResNet-50 | 384×288 | 5 |
ResNet-101 | 256×192 | 4 |
ResNet-101 | 256×256 | 4 |
ResNet-101 | 384×288 | 5 |
ResNet-152 | 256×192 | 4 |
ResNet-152 | 256×256 | 4 |
ResNet-152 | 384×288 | 5 |
HR-W32 | 256×192 | 4 |
HR-W32 | 384×288 | 5 |
HR-W48 | 256×192 | 4 |
HR-W48 | 384×288 | 5 |
Tab. 1 Optimal error compensation factor Δopt value of each model
Model | Input resolution | Algorithm | AP/% | AP50/% | AP75/% | APM/% | APL/% | AR/% | AR50/% | AR75/% | ARM/% | ARL/% |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ResNet-50 | 256×192 | Standard | 65.34 | 90.37 | 74.48 | 63.25 | 68.59 | 69.32 | 91.85 | 77.96 | 66.57 | 73.48 |
ResNet-50 | 256×192 | Shifting | 66.80 | 90.43 | 75.74 | 65.15 | 70.28 | 70.84 | 91.99 | 78.90 | 68.09 | 75.00 |
ResNet-50 | 256×192 | DARK | 68.40 | 91.38 | 76.89 | 66.60 | 71.59 | 72.01 | 92.07 | 79.72 | 69.30 | 76.14 |
ResNet-50 | 256×192 | Proposed | 70.63 | 91.40 | 78.17 | 68.27 | 74.66 | 74.11 | 92.24 | 80.81 | 70.98 | 78.85 |
ResNet-50 | 384×288 | Standard | 69.85 | 91.46 | 77.07 | 66.86 | 74.66 | 73.28 | 92.48 | 79.83 | 69.55 | 78.80 |
ResNet-50 | 384×288 | Shifting | 70.71 | 91.47 | 78.01 | 67.45 | 75.55 | 73.96 | 92.51 | 80.26 | 70.18 | 79.56 |
ResNet-50 | 384×288 | DARK | 71.49 | 91.47 | 78.20 | 68.43 | 76.50 | 74.71 | 92.66 | 80.79 | 70.93 | 80.35 |
ResNet-50 | 384×288 | Proposed | 72.92 | 91.52 | 79.41 | 69.20 | 78.45 | 75.80 | 92.87 | 81.72 | 71.72 | 81.86 |
ResNet-101 | 256×192 | Standard | 66.60 | 91.45 | 75.77 | 65.21 | 69.60 | 70.54 | 92.46 | 78.84 | 68.04 | 74.35 |
ResNet-101 | 256×192 | Shifting | 68.43 | 91.44 | 77.89 | 66.77 | 71.40 | 72.06 | 92.44 | 80.05 | 69.60 | 75.86 |
ResNet-101 | 256×192 | DARK | 69.30 | 91.48 | 78.08 | 67.85 | 72.60 | 73.13 | 92.66 | 80.72 | 70.66 | 76.99 |
ResNet-101 | 256×192 | Proposed | 71.98 | 92.48 | 79.32 | 69.60 | 75.73 | 75.31 | 93.15 | 81.85 | 72.44 | 79.73 |
ResNet-101 | 384×288 | Standard | 71.63 | 92.44 | 80.19 | 69.04 | 76.02 | 75.07 | 93.25 | 82.24 | 71.75 | 80.12 |
ResNet-101 | 384×288 | Shifting | 72.42 | 92.45 | 80.25 | 69.78 | 76.66 | 75.76 | 93.26 | 82.51 | 72.49 | 80.75 |
ResNet-101 | 384×288 | DARK | 73.22 | 92.47 | 80.35 | 70.70 | 77.68 | 76.51 | 93.31 | 82.97 | 73.20 | 81.56 |
ResNet-101 | 384×288 | Proposed | 74.52 | 92.47 | 81.40 | 71.44 | 79.40 | 77.55 | 93.42 | 83.61 | 73.97 | 82.99 |
ResNet-152 | 256×192 | Standard | 67.42 | 91.48 | 76.75 | 65.51 | 70.85 | 71.26 | 92.66 | 79.83 | 68.63 | 75.28 |
ResNet-152 | 256×192 | Shifting | 68.86 | 91.52 | 77.86 | 67.10 | 72.23 | 72.60 | 92.85 | 80.68 | 70.02 | 76.55 |
ResNet-152 | 256×192 | DARK | 70.17 | 92.47 | 78.93 | 68.17 | 73.59 | 73.74 | 93.03 | 81.27 | 71.13 | 77.77 |
ResNet-152 | 256×192 | Proposed | 72.75 | 92.51 | 80.34 | 70.00 | 76.84 | 75.95 | 93.14 | 82.68 | 72.84 | 80.68 |
ResNet-152 | 384×288 | Standard | 72.83 | 92.50 | 81.38 | 70.24 | 76.99 | 76.15 | 93.64 | 83.50 | 72.95 | 81.00 |
ResNet-152 | 384×288 | Shifting | 73.51 | 92.52 | 81.47 | 70.96 | 77.74 | 76.80 | 93.73 | 83.80 | 73.60 | 81.67 |
ResNet-152 | 384×288 | DARK | 74.26 | 92.54 | 82.44 | 71.88 | 78.63 | 77.50 | 93.77 | 84.32 | 74.34 | 82.31 |
ResNet-152 | 384×288 | Proposed | 75.48 | 92.54 | 82.59 | 72.57 | 80.33 | 78.50 | 93.84 | 84.70 | 75.05 | 83.75 |
HR-W32 | 256×192 | Standard | 69.66 | 92.49 | 79.02 | 67.87 | 73.16 | 73.42 | 93.77 | 81.99 | 70.79 | 77.48 |
HR-W32 | 256×192 | Shifting | 71.33 | 92.49 | 81.11 | 69.63 | 74.68 | 74.85 | 93.78 | 83.01 | 72.21 | 78.95 |
HR-W32 | 256×192 | DARK | 72.74 | 92.51 | 81.41 | 70.85 | 76.57 | 76.24 | 93.83 | 83.82 | 73.46 | 80.53 |
HR-W32 | 256×192 | Proposed | 75.47 | 93.49 | 83.50 | 72.86 | 79.52 | 78.35 | 94.05 | 85.11 | 75.26 | 83.13 |
HR-W32 | 384×288 | Standard | 73.53 | 92.54 | 82.21 | 71.24 | 77.74 | 76.94 | 93.88 | 84.15 | 73.69 | 81.92 |
HR-W32 | 384×288 | Shifting | 74.45 | 92.54 | 82.33 | 71.84 | 78.62 | 77.69 | 93.92 | 84.49 | 74.45 | 82.66 |
HR-W32 | 384×288 | DARK | 75.75 | 93.55 | 83.33 | 73.05 | 79.92 | 78.71 | 94.16 | 85.06 | 75.45 | 83.72 |
HR-W32 | 384×288 | Proposed | 77.00 | 93.54 | 83.67 | 73.86 | 81.86 | 79.71 | 94.14 | 85.64 | 76.17 | 85.13 |
HR-W48 | 256×192 | Standard | 69.86 | 92.48 | 79.79 | 68.12 | 73.31 | 73.70 | 93.73 | 82.31 | 70.90 | 77.92 |
HR-W48 | 256×192 | Shifting | 71.53 | 92.50 | 81.03 | 69.56 | 75.05 | 75.23 | 93.78 | 83.28 | 72.38 | 79.55 |
HR-W48 | 256×192 | DARK | 72.84 | 92.52 | 82.11 | 71.18 | 76.36 | 76.51 | 93.86 | 84.18 | 73.70 | 80.81 |
HR-W48 | 256×192 | Proposed | 75.70 | 93.50 | 83.56 | 73.05 | 79.92 | 78.71 | 94.07 | 85.53 | 75.44 | 83.68 |
HR-W48 | 384×288 | Standard | 74.42 | 93.48 | 82.41 | 71.72 | 78.60 | 77.60 | 94.05 | 84.65 | 74.41 | 82.49 |
HR-W48 | 384×288 | Shifting | 75.18 | 93.48 | 82.53 | 72.54 | 79.39 | 78.28 | 94.11 | 84.93 | 75.11 | 83.16 |
HR-W48 | 384×288 | DARK | 76.15 | 93.50 | 83.69 | 73.59 | 80.46 | 79.15 | 94.11 | 85.67 | 75.99 | 84.02 |
HR-W48 | 384×288 | Proposed | 77.23 | 93.52 | 83.74 | 74.15 | 82.25 | 80.07 | 94.24 | 85.97 | 76.61 | 85.41 |
Tab. 2 Accuracy on COCO validation dataset (validation process without flip strategy)
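The 2.86-point AP figure quoted in the abstract can be checked against Tab. 2. The sketch below copies the AP values of the DARK and proposed algorithms from the table and computes the gain for each model and resolution. The dictionary layout and variable names are illustrative and not taken from the paper.

```python
# AP (%) on the COCO validation set without flip, copied from Tab. 2.
# Each value is (DARK, proposed); keys are (model, input resolution).
ap = {
    ("ResNet-50", "256x192"): (68.40, 70.63),
    ("ResNet-50", "384x288"): (71.49, 72.92),
    ("ResNet-101", "256x192"): (69.30, 71.98),
    ("ResNet-101", "384x288"): (73.22, 74.52),
    ("ResNet-152", "256x192"): (70.17, 72.75),
    ("ResNet-152", "384x288"): (74.26, 75.48),
    ("HR-W32", "256x192"): (72.74, 75.47),
    ("HR-W32", "384x288"): (75.75, 77.00),
    ("HR-W48", "256x192"): (72.84, 75.70),
    ("HR-W48", "384x288"): (76.15, 77.23),
}

# Gain of the proposed algorithm over DARK, in percentage points.
gains = {key: round(ours - dark, 2) for key, (dark, ours) in ap.items()}

for (model, resolution), gain in gains.items():
    print(f"{model:<10} {resolution}: +{gain:.2f}")

best = max(gains, key=gains.get)
print("largest gain:", best, gains[best])
```

By this calculation the gain over DARK ranges from 1.08 to 2.86 percentage points. Every 256×192 configuration gains more than 2 points, and every 384×288 configuration gains less than 1.5.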
Model | Algorithm | Head | Shoul. | Elbow | Wrist | Hip | Knee | Ankle | PCKh0.1 | PCKh0.5 |
---|---|---|---|---|---|---|---|---|---|---|
ResNet-50 | Standard | 96.04 | 94.19 | 87.25 | 81.34 | 86.15 | 81.60 | 78.32 | 21.55 | 86.99 |
ResNet-50 | Shifting | 96.04 | 94.34 | 87.35 | 81.53 | 86.41 | 81.85 | 78.48 | 23.40 | 87.15 |
ResNet-50 | DARK | 96.15 | 94.53 | 87.76 | 81.87 | 86.76 | 82.49 | 78.81 | 24.48 | 87.48 |
ResNet-50 | Proposed | 95.87 | 94.87 | 88.44 | 82.05 | 87.62 | 83.22 | 79.48 | 31.69 | 87.95 |
ResNet-101 | Standard | 96.35 | 94.62 | 87.40 | 82.41 | 85.72 | 82.35 | 78.77 | 22.07 | 87.36 |
ResNet-101 | Shifting | 96.59 | 94.58 | 87.69 | 82.39 | 86.22 | 82.71 | 78.98 | 23.66 | 87.56 |
ResNet-101 | DARK | 96.32 | 94.72 | 88.07 | 82.85 | 86.71 | 83.16 | 79.24 | 24.82 | 87.85 |
ResNet-101 | Proposed | 96.28 | 94.80 | 88.55 | 83.42 | 87.54 | 83.42 | 79.74 | 32.09 | 88.25 |
ResNet-152 | Standard | 96.62 | 95.02 | 88.27 | 82.70 | 86.38 | 83.30 | 79.85 | 22.55 | 87.98 |
ResNet-152 | Shifting | 96.62 | 95.31 | 88.56 | 82.99 | 86.91 | 83.58 | 79.83 | 24.31 | 88.23 |
ResNet-152 | DARK | 96.73 | 95.33 | 88.80 | 83.66 | 87.02 | 83.78 | 80.63 | 25.28 | 88.50 |
ResNet-152 | Proposed | 96.56 | 95.67 | 88.97 | 83.85 | 87.99 | 84.14 | 80.52 | 33.07 | 88.78 |
HR-W32 | Standard | 96.79 | 95.06 | 89.08 | 84.29 | 86.01 | 84.40 | 81.39 | 23.49 | 88.61 |
HR-W32 | Shifting | 96.93 | 95.25 | 89.06 | 84.39 | 86.43 | 84.89 | 81.58 | 25.36 | 88.81 |
HR-W32 | DARK | 96.97 | 95.40 | 89.57 | 85.03 | 87.04 | 85.67 | 82.03 | 27.38 | 89.25 |
HR-W32 | Proposed | 96.86 | 95.58 | 89.98 | 85.49 | 87.83 | 86.18 | 82.59 | 35.80 | 89.67 |
Tab. 3 PCKh accuracy on MPII validation dataset (validation process without flip strategy) (%)
Algorithm | Resolution | Gaussian smoothing | ResNet-50 accuracy/% | ResNet-50 difference/pp | ResNet-101 accuracy/% | ResNet-101 difference/pp | ResNet-152 accuracy/% | ResNet-152 difference/pp | HR-W32 accuracy/% | HR-W32 difference/pp |
---|---|---|---|---|---|---|---|---|---|---|
DARK | 256×192 | Yes | 68.40 | 0.52↓ | 69.30 | 0.09↓ | 70.17 | 0.26↓ | 72.74 | 0.50↓ |
DARK | 256×192 | No | 67.88 | | 69.21 | | 69.91 | | 72.24 | |
DARK | 384×288 | Yes | 71.49 | 0.41↓ | 73.22 | 0.24↓ | 74.26 | 0.13↓ | 75.75 | 0.75↓ |
DARK | 384×288 | No | 71.08 | | 72.98 | | 74.13 | | 75.00 | |
Proposed | 256×192 | Yes | 70.69 | 0.06↓ | 71.96 | 0.02↑ | 72.76 | 0.01↓ | 75.43 | 0.04↑ |
Proposed | 256×192 | No | 70.63 | | 71.98 | | 72.75 | | 75.47 | |
Proposed | 384×288 | Yes | 72.84 | 0.08↑ | 74.53 | 0.01↓ | 75.31 | 0.17↑ | 77.01 | 0.01↓ |
Proposed | 384×288 | No | 72.92 | | 74.52 | | 75.48 | | 77.00 | |
Tab. 4 Influence of resolution, Gaussian smoothing and backbone network on algorithm accuracy
Algorithm | Extra time per frame |
---|---|
Shifting | 0.3 |
DARK | 3.0 |
Proposed | 1.4 |
Tab. 5 Extra running time of different methods beyond standard method (ms)
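The "about twice as fast" statement in the abstract can be related to Tab. 5 with a short calculation. The sketch below uses only the three per-frame values from the table; the variable names are illustrative. Note that Tab. 5 reports decoding time added on top of the standard algorithm, so the ratio describes the decoding overhead and not end-to-end inference time.

```python
# Extra decoding time per frame (ms) beyond the standard algorithm, from Tab. 5.
extra_ms = {"Shifting": 0.3, "DARK": 3.0, "Proposed": 1.4}

# How many times larger the DARK overhead is than the proposed one.
ratio = extra_ms["DARK"] / extra_ms["Proposed"]

# Absolute time saved per frame relative to DARK.
saved_ms = round(extra_ms["DARK"] - extra_ms["Proposed"], 1)

print(f"DARK / proposed overhead ratio: {ratio:.2f}")
print(f"time saved per frame: {saved_ms} ms")
```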