Journal of Computer Applications ›› 2023, Vol. 43 ›› Issue (2): 583-588. DOI: 10.11772/j.issn.1001-9081.2021122075
Special topic: Multimedia computing and computer simulation
Received: 2021-12-09
Revised: 2022-02-20
Accepted: 2022-02-23
Online: 2023-02-08
Published: 2023-02-10
Contact: Cuixiang LIU
About author: SU Yating, born in 1995 in Shijiazhuang, Hebei, M. S. candidate. Her research interests include information perception and machine learning.
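Abstract:
To address the head pose flipping and the loss of implicit spatial cues between image features that arise when reconstructing a human body from a monocular image, a three-dimensional human reconstruction model based on High-Resolution Net (HRNet) and Graph Convolutional Network (GCN) was proposed. First, HRNet and residual blocks were used as the backbone network to extract rich human feature information from the original image. Then, GCN was used to capture the implicit spatial cues between features and obtain a spatially accurate feature representation. Finally, this representation was used to predict the parameters of the Skinned Multi-Person Linear model (SMPL) for a more accurate reconstruction. In addition, to resolve the head pose flipping problem, the SMPL joints were redefined by adding head joints to the original joint set. Experimental results show that the proposed model reconstructs the 3D human body accurately: its reconstruction accuracy on the 2D dataset LSP reaches 92.41%, and on the 3D dataset MPI-INF-3DHP its joint error and reconstruction error are greatly reduced, averaging only 97.73 mm and 64.63 mm respectively, which verifies the effectiveness of the proposed model for human reconstruction.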
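The abstract does not say which graph convolution the model uses, so the following is only a minimal sketch of one common formulation (symmetric-normalized adjacency with self-loops, followed by ReLU), stacked N times in the spirit of the layer count N varied in Tab. 4. The function names, the 4-node chain graph, the feature sizes, and the random weights are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def normalize_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    a_hat = adj + np.eye(adj.shape[0])             # self-loops keep each node's own feature
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1 after self-loops
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_stack(feats, adj, weights):
    """Apply len(weights) graph-convolution layers: H <- ReLU(A_norm @ H @ W)."""
    a_norm = normalize_adjacency(adj)
    h = feats
    for w in weights:
        h = np.maximum(a_norm @ h @ w, 0.0)
    return h


# Hypothetical 4-node chain graph and random weights, for illustration only.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
feats = rng.normal(size=(4, 8))                              # 4 nodes, 8 channels each
weights = [0.1 * rng.normal(size=(8, 8)) for _ in range(3)]  # N = 3 layers
print(gcn_stack(feats, adj, weights).shape)                  # (4, 8)
```

Each layer mixes a node's features with those of its graph neighbours, which is one way such a network can pick up spatial relations that are only implicit in the per-node features. Passing an empty weight list corresponds to N = 0 and returns the features unchanged.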
Yating SU, Cuixiang LIU. Three-dimensional human reconstruction model based on high-resolution net and graph convolutional network[J]. Journal of Computer Applications, 2023, 43(2): 583-588.
Tab. 1 Reconstruction performance comparison (%)
Model | F1 | Accuracy |
---|---|---|
SMPLify | 84.90 | 90.56 |
HMR | 86.95 | 91.02 |
CMR | 87.10 | 91.55 |
Proposed model | 88.03 | 92.41 |
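The page does not state how the F1 and accuracy values in Tab. 1 are computed. A common protocol on LSP compares the projected body mask with a ground-truth foreground/background mask pixel by pixel, and the sketch below assumes that protocol. The function name and the toy masks are hypothetical.

```python
import numpy as np


def mask_accuracy_f1(pred, gt):
    """Pixel-wise accuracy and F1 between two binary foreground masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    tp = int(np.sum(pred & gt))      # foreground predicted as foreground
    fp = int(np.sum(pred & ~gt))     # background predicted as foreground
    fn = int(np.sum(~pred & gt))     # foreground predicted as background
    tn = int(np.sum(~pred & ~gt))    # background predicted as background
    accuracy = (tp + tn) / pred.size
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom > 0 else 0.0
    return accuracy, f1


# Toy 2x3 masks, for illustration only.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
gt_mask = [[1, 0, 0],
           [0, 1, 1]]
acc, f1 = mask_accuracy_f1(pred_mask, gt_mask)
print(f"accuracy={100 * acc:.2f}%  F1={100 * f1:.2f}%")
```

Accuracy counts background pixels as well, so it tends to be higher than F1 when the person occupies a small part of the image, which is consistent with the ordering of the two columns in Tab. 1.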
Tab. 2 MPJPE error results (mm)
Video sequence | SMPLify | HMR | CMR | Proposed |
---|---|---|---|---|
Average | 943.57 | 235.73 | 181.80 | 97.73 |
TS1 | 844.13 | 187.09 | 145.70 | 63.36 |
TS2 | 897.08 | 283.63 | 172.57 | 89.76 |
TS3 | 1059.01 | 251.29 | 160.07 | 91.96 |
TS4 | 974.92 | 265.72 | 233.98 | 106.48 |
TS5 | 856.23 | 172.19 | 208.39 | 116.07 |
TS6 | 1030.02 | 254.45 | 170.09 | 118.73 |
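MPJPE (mean per-joint position error), the metric in Tab. 2, is the average Euclidean distance between predicted and ground-truth 3D joints. The sketch below shows the standard definition. The optional root alignment and the index of the root joint are assumptions, since the page does not say whether or how poses are aligned before the error is measured.

```python
import numpy as np


def mpjpe(pred, gt, root=None):
    """Mean per-joint position error for arrays of shape (..., J, 3).

    If `root` is given, both poses are first translated so that joint
    `root` sits at the origin (a common convention, assumed here).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if root is not None:
        pred = pred - pred[..., root:root + 1, :]
        gt = gt - gt[..., root:root + 1, :]
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


# Toy pose with 3 joints (mm); the prediction is shifted by (3, 4, 0).
gt_pose = np.array([[0.0, 0.0, 0.0],
                    [100.0, 0.0, 0.0],
                    [100.0, 200.0, 0.0]])
pred_pose = gt_pose + np.array([3.0, 4.0, 0.0])
print(mpjpe(pred_pose, gt_pose))          # 5.0: every joint is off by 5 mm
print(mpjpe(pred_pose, gt_pose, root=0))  # 0.0: a pure translation vanishes
```

The "Average" row of Tab. 2 is the mean over the six test sequences TS1 to TS6.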
Tab. 3 Reconstruction error results (mm)
Video sequence | SMPLify | HMR | CMR | Proposed |
---|---|---|---|---|
Average | 138.85 | 130.63 | 97.38 | 64.63 |
TS1 | 171.14 | 102.07 | 75.29 | 41.72 |
TS2 | 145.51 | 132.44 | 112.70 | 60.29 |
TS3 | 123.27 | 142.19 | 91.94 | 58.60 |
TS4 | 135.35 | 152.72 | 110.51 | 66.00 |
TS5 | 138.76 | 108.19 | 85.66 | 73.86 |
TS6 | 119.09 | 146.15 | 108.15 | 87.31 |
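The page does not define "reconstruction error". In related work it usually denotes MPJPE measured after a rigid Procrustes alignment (scale, rotation, translation) of the prediction to the ground truth, and the sketch below assumes that meaning. The function names and the toy data are hypothetical.

```python
import numpy as np


def procrustes_align(pred, gt):
    """Similarity-align pred (J, 3) to gt (J, 3): scale, rotation, translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the cross-covariance gives the rotation that best maps p onto g.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last singular direction if needed so the result is not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    diag = np.array([1.0, 1.0, d])
    rot = u @ np.diag(diag) @ vt
    scale = (s * diag).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g


def pa_mpjpe(pred, gt):
    """Mean per-joint error after Procrustes alignment."""
    aligned = procrustes_align(np.asarray(pred, float), np.asarray(gt, float))
    return float(np.linalg.norm(aligned - gt, axis=-1).mean())


# Toy check: a scaled, rotated, shifted copy of the ground truth aligns back exactly.
rng = np.random.default_rng(0)
gt_pose = rng.normal(size=(6, 3)) * 100.0
c, s_ = np.cos(np.pi / 6), np.sin(np.pi / 6)
rot_z = np.array([[c, -s_, 0.0], [s_, c, 0.0], [0.0, 0.0, 1.0]])
pred_pose = 2.5 * gt_pose @ rot_z + np.array([10.0, -20.0, 30.0])
print(pa_mpjpe(pred_pose, gt_pose) < 1e-6)  # True
```

Because the alignment removes global scale, rotation, and translation, this error isolates the quality of the articulated pose itself. That would explain why the values in Tab. 3 are lower than the MPJPE values in Tab. 2 for every method.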
Tab. 4 Ablation experiment on MPI-INF-3DHP dataset
Number of layers N | Reconstruction error/mm | MPJPE/mm |
---|---|---|
0 | 181.52 | 447.78 |
1 | 183.90 | 259.01 |
2 | 122.02 | 224.78 |
3 | 105.78 | 180.05 |
4 | 81.84 | 117.69 |
5 | 55.61 | 88.60 |
Tab. 5 Influence of head joints on reconstruction error (mm)
Head pose constraint | TS2 | TS4 | TS6 |
---|---|---|---|
Without | 112.49 | 114.12 | 136.36 |
With | 63.48 | 66.88 | 89.95 |
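To make the effect in Tab. 5 easier to read, the short sketch below converts the two rows into relative reductions. The numbers are copied from the table; the helper name and the dictionary layout are illustrative.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# Reconstruction error (mm) from Tab. 5, without and with the head pose constraint.
without_head = {"TS2": 112.49, "TS4": 114.12, "TS6": 136.36}
with_head = {"TS2": 63.48, "TS4": 66.88, "TS6": 89.95}

for seq in without_head:
    drop = relative_reduction(without_head[seq], with_head[seq])
    print(f"{seq}: {drop:.1f}% lower with the head pose constraint")
```

On these three sequences, adding the head joints lowers the reconstruction error by roughly 34% to 44%.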