Human dimension attention regressor method for monocular occluded human mesh recovery

Journal of Computer Applications, 2026, Vol. 46, Issue 6: 1981-1988. DOI: 10.11772/j.issn.1001-9081.2025060705

• Multimedia computing and computer simulation •

Menghua WANG^1,2, Yukun DONG^1,2 (corresponding author), Long CHENG^1,2, Junqi SUN^1,2

^1. Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
^2. Shandong Key Laboratory of Intelligent Oil and Gas Industrial Software (China University of Petroleum (East China)), Qingdao, Shandong 266580, China

Received: 2025-06-24 Revised: 2025-08-31 Accepted: 2025-09-02 Online: 2025-09-09 Published: 2026-06-10
Contact: Yukun DONG
About author: WANG Menghua (王梦华), born in 2001, a native of Zhengzhou, Henan, M. S. candidate, CCF member. His research interests include machine vision and human mesh recovery.
CHENG Long (程龙), born in 2000, a native of Yantai, Shandong, M. S. candidate. His research interests include deep learning and machine vision.
SUN Junqi (孙骏骐), born in 2002, a native of Huainan, Anhui, M. S. candidate. His research interests include deep learning and machine vision.
First contact: DONG Yukun (董玉坤), born in 1981, a native of Jining, Shandong, Ph. D., associate professor, CCF member. His research interests include deep learning and software testing.
Supported by:
Shandong Provincial Natural Science Foundation (ZR2024MF129); Shandong Provincial Natural Science Foundation (ZR2023MF041)

Abstract:

In real-world scenes, people in images are often occluded by clothing, by their own pose (self-occlusion), or by objects in the environment. The resulting lack of visible information causes existing human reconstruction methods to regress toward the mean body shape, so they fail to recover individual body characteristics faithfully. To address this problem, a Human Dimension Attention Regressor (HDAR) method for monocular occluded human mesh recovery is proposed. First, human dimensions measured in the visible region are used to infer the dimensions of the occluded parts. Second, a hierarchical proportion constraint is introduced: first-level constraints are applied between adjacent body parts and second-level constraints between distant body parts, so that the regressed shape conforms to human structural characteristics. Finally, two-dimensional (2D) joint information is combined with the body dimension information in an iterative optimization that improves pose estimation accuracy. Experimental results on the 3DPW (Three-Dimensional (3D) Poses in the Wild) dataset show that the proposed method achieves a Per Vertex Error (PVE) of 65.2 mm, which is 10.7 mm (14.1%) lower than that of Multi-HMR (Multi-person whole-body Human Mesh Recovery) under occlusion conditions. Visualization results show that the proposed method effectively improves the reconstruction accuracy of human shape and pose in complex occlusion scenarios.

Key words: monocular occluded human reconstruction, human dimension, human measurement, occluded human, hierarchical proportion constraint, iterative optimization
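
The hierarchical proportion constraint described in the abstract can be illustrated with a minimal sketch. This page does not give the paper's actual formulation, so everything below is an assumption made for illustration: the dimension names, the reference ratios, the tolerances, the weights, and the squared-hinge form of the penalty are all hypothetical. The sketch only shows the general idea of penalizing implausible ratios between adjacent parts (level 1) more strictly than between distant parts (level 2).

```python
# Illustrative sketch only: names, ratios, tolerances and weights are
# hypothetical and are NOT taken from the paper.

# Level 1: reference ratios between adjacent body parts (assumed values).
LEVEL1 = {("upper_arm", "forearm"): 1.2, ("thigh", "calf"): 1.1}
# Level 2: reference ratios between distant body parts (assumed values).
LEVEL2 = {("upper_arm", "thigh"): 0.68}


def proportion_penalty(dims, pairs, tol):
    """Sum of squared ratio deviations that exceed the tolerance band."""
    total = 0.0
    for (a, b), ref in pairs.items():
        ratio = dims[a] / dims[b]
        excess = max(0.0, abs(ratio - ref) - tol)  # zero inside the band
        total += excess ** 2
    return total


def hierarchical_penalty(dims, w1=1.0, w2=0.5, tol1=0.05, tol2=0.10):
    """Weighted sum of level-1 (tight) and level-2 (loose) penalties."""
    return (w1 * proportion_penalty(dims, LEVEL1, tol1)
            + w2 * proportion_penalty(dims, LEVEL2, tol2))


# Hypothetical segment lengths in centimetres.
plausible = {"upper_arm": 30.0, "forearm": 25.0, "thigh": 44.0, "calf": 40.0}
distorted = dict(plausible, forearm=10.0)  # forearm far too short

print(hierarchical_penalty(plausible))
print(hierarchical_penalty(distorted))
```

In this sketch a body whose ratios fall inside the tolerance bands receives no penalty, while a shape with an implausibly short forearm is penalized through the level-1 term. A looser tolerance and smaller weight at level 2 is one possible design choice, reflecting that proportions between distant parts tend to vary more across individuals.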

CLC Number: TP391.41

Menghua WANG, Yukun DONG, Long CHENG, Junqi SUN. Human dimension attention regressor method for monocular occluded human mesh recovery[J]. Journal of Computer Applications, 2026, 46(6): 1981-1988.

王梦华, 董玉坤, 程龙, 孙骏骐. 用于单目遮挡人体网格恢复的人体尺寸注意力回归方法[J]. 计算机应用, 2026, 46(6): 1981-1988.

Figures/Tables 9

Fig.1 Segmentation graph of human dimensions and body parts

Fig.2 Computation of curved human dimension

Fig.3 Some human dimension data aged 17~70

Fig.4 Overall structure of HDAR

Fig.5 HDAR’s prediction results on some occluded human images

Fig.6 Qualitative comparison of HDAR and other methods

Tab.1 Quantitative comparison of different methods on two datasets

Dataset	Method	MPJPE (mm)	PA-MPJPE (mm)	PVE (mm)
3DPW	SPIN	96.9	59.2	135.1
3DPW	HMR	130.0	76.7	-
3DPW	PARE	82.9	52.3	99.7
3DPW	PyMAF	92.8	58.9	110.1
3DPW	Multi-HMR	61.4	41.7	75.9
3DPW	Proposed method (HDAR)	54.7	34.5	65.2
3DPW-OCC	SPIN	95.6	60.8	121.6
3DPW-OCC	HMR-EFT	94.4	60.9	111.3
3DPW-OCC	PARE	90.5	56.6	107.9
3DPW-OCC	Proposed method (HDAR)	78.6	49.4	98.7
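
The three metrics in Tab.1 are standard in human mesh recovery, and a minimal NumPy sketch of how they are commonly computed may help when reading the table. This is not the authors' evaluation code. The function names are hypothetical, and details such as root-joint alignment before MPJPE or the joint regressor used are assumptions that vary between evaluation protocols. MPJPE is the mean Euclidean distance over joints, PA-MPJPE is the same error after a similarity (Procrustes) alignment of the prediction to the ground truth, and PVE applies the same mean distance to mesh vertices.

```python
import numpy as np


def mean_per_point_error(pred, gt):
    """Mean Euclidean distance between corresponding 3D points, shape (N, 3).

    On joints this gives MPJPE; on mesh vertices it gives PVE.
    """
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best scale, rotation and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance between centred point sets.
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction."""
    return mean_per_point_error(procrustes_align(pred, gt), gt)
```

Because PA-MPJPE removes global scale, rotation, and translation, it isolates errors in articulated pose, which is why it is always lower than MPJPE in Tab.1.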

Tab.2 Performance evaluation of different components of objective function

Objective function	PA-MPJPE (mm)	PVE (mm)
Full objective L	34.5	65.2
Without L_D	47.1	84.4
Without L_global, L_local	43.6	80.5
Without L_local, L_global, L_D	52.3	99.7
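
To read Tab.2 as the cost of removing each loss term, the rows can be compared against the full objective. The short sketch below does only that arithmetic on the values from the table. The dictionary keys and the function name are illustrative, and the sketch says nothing about how the loss terms are defined or weighted.

```python
# Values copied from Tab.2: (PA-MPJPE in mm, PVE in mm).
ablation = {
    "full objective L": (34.5, 65.2),
    "without L_D": (47.1, 84.4),
    "without L_global, L_local": (43.6, 80.5),
    "without L_local, L_global, L_D": (52.3, 99.7),
}


def degradation(table, baseline="full objective L"):
    """Error increase of each ablated variant relative to the baseline."""
    base_pa, base_pve = table[baseline]
    return {
        name: (round(pa - base_pa, 1), round(pve - base_pve, 1))
        for name, (pa, pve) in table.items()
        if name != baseline
    }


for name, (d_pa, d_pve) in degradation(ablation).items():
    print(f"{name}: PA-MPJPE +{d_pa} mm, PVE +{d_pve} mm")
```

On these numbers, removing L_D alone raises PVE by 19.2 mm, removing the two proportion terms raises it by 15.3 mm, and removing all three raises it by 34.5 mm.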

Fig.7 Change of metrics during iteration
