Journal of Computer Applications


Lightweight human pose estimation based on fused feature state space model

LI Zhuoran1, LI Hua1, WANG Tong2, JIANG Chaozhe2   

  1. School of Information Science and Technology, Southwest Jiaotong University; 2. School of Transportation and Logistics, Southwest Jiaotong University
  • Received:2024-09-23 Revised:2024-11-25 Online:2024-12-20 Published:2024-12-20
  • About author: LI Zhuoran, born in 2000, M. S. candidate. His research interests include computer vision and human pose estimation. LI Hua, born in 1979, Ph. D., lecturer. His research interests include computer vision and deep learning. WANG Tong, born in 1995, M. S. candidate. His research interests include machine learning and image processing. JIANG Chaozhe, born in 1968, Ph. D., professor. His research interests include big data, artificial intelligence, and advanced manufacturing.
  • Supported by:
    Jade Kirin Science and Innovation Fund Project (2019H010362)

基于融合特征状态空间模型的轻量化人体姿态估计

李卓然1,李华1,王桐2,蒋朝哲2   

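  1. School of Information Science and Technology, Southwest Jiaotong University; 2. School of Transportation and Logistics, Southwest Jiaotong University
  • Corresponding author: LI Hua
  • About author: LI Zhuoran (born 2000), male, from Dazhou, Sichuan, M. S. candidate, CCF student member; research interests: computer vision, human pose estimation. LI Hua (born 1979), male, from Chengdu, Sichuan, lecturer, Ph. D.; research interests: computer vision, deep learning. WANG Tong (born 1995), male, from Shijiazhuang, Hebei, M. S. candidate; research interests: machine learning, image processing. JIANG Chaozhe (born 1968), male, from Dazhou, Sichuan, professor, Ph. D.; research interests: big data, artificial intelligence, advanced manufacturing.
  • Supported by:
    Jade Kirin Science and Innovation Fund Project (2019H010362)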

Abstract: In human pose estimation, heatmap-based methods suffer from quantization error, high computational complexity, and the need to post-process the heatmap. To address these problems, a lightweight human pose estimation model based on a fused feature state space model, named Lite-SimCC, was proposed, with the coordinate regression method SimCC as the baseline. ShuffleNetV2 was adopted as the backbone network in place of the original High-Resolution Net (HRNet), simplifying the network to a single-branch form and making the model lightweight. To mitigate the resulting loss of accuracy, a large-kernel convolution was introduced to extract global feature information. A state space model with fused features was further designed to process both local and global long-sequence features and thereby strengthen the representation of keypoints. Finally, a soft-label-based loss function was proposed to replace the traditional one-hot loss calculation. Experimental results show that, compared with the baseline method SimCC, Lite-SimCC reduces the number of parameters by 87.1% and improves the Average Precision (AP) by 1% to 71.8% on the COCO2017 test set. Experiments on the MPII dataset further show that Lite-SimCC effectively reduces the number of model parameters while maintaining detection accuracy.
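
The contrast between a one-hot target and a soft label in SimCC-style coordinate classification can be illustrated with a minimal sketch. The abstract does not give the form of the proposed loss, so the Gaussian smoothing, the `sigma` value, and all function names below are illustrative assumptions and not the authors' implementation. The sketch treats one axis of one keypoint as a classification over `num_bins` positions.

```python
import numpy as np

def one_hot_label(coord, num_bins):
    """Traditional target: all mass on the bin nearest to the coordinate."""
    label = np.zeros(num_bins)
    label[int(round(coord))] = 1.0
    return label

def soft_label(coord, num_bins, sigma=2.0):
    """Assumed soft target: a normalized Gaussian centered on the coordinate.

    The Gaussian form and sigma are hypothetical choices for illustration.
    """
    bins = np.arange(num_bins)
    weights = np.exp(-((bins - coord) ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())

def cross_entropy(logits, target):
    """Cross-entropy between a target distribution and predicted logits."""
    return float(-(target * log_softmax(logits)).sum())

def peaked_logits(peak_bin, num_bins, height=5.0):
    """Toy prediction: a single confident peak at peak_bin."""
    logits = np.zeros(num_bins)
    logits[peak_bin] = height
    return logits

num_bins, true_x = 32, 10.3
near = peaked_logits(11, num_bins)   # prediction one bin away from the truth
far = peaked_logits(25, num_bins)    # prediction far from the truth

hard = one_hot_label(true_x, num_bins)
soft = soft_label(true_x, num_bins)

# With the one-hot target, both wrong predictions are penalized equally.
hard_near, hard_far = cross_entropy(near, hard), cross_entropy(far, hard)
# With the soft target, the near miss is penalized less than the far miss.
soft_near, soft_far = cross_entropy(near, soft), cross_entropy(far, soft)
```

In this toy setting the one-hot target cannot distinguish a near miss from a far miss, whereas the soft target gives a smaller loss to the prediction closer to the true coordinate, which is the general motivation for replacing one-hot targets with soft labels.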

Key words: human pose estimation, coordinate regression, state space model, lightweight, soft-label

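Abstract: In human pose estimation, heatmap-based methods suffer from quantization error, high computational complexity, and the need to post-process the heatmap. To address these problems, a lightweight human pose estimation method based on a fused feature state space model, Lite-SimCC, is proposed, taking the coordinate regression method SimCC as the baseline. ShuffleNetV2 is adopted as the backbone network in place of the original HRNet (High-Resolution Net), which simplifies the network to a single-branch form and makes the model lightweight. To mitigate the loss of accuracy, a large-kernel convolution is introduced to extract global feature information. A state space model with fused features is further designed to process local and global long-sequence features and strengthen the representation of keypoints. Finally, a soft-label-based loss function is proposed to replace the traditional one-hot loss calculation. Experimental results show that, compared with the baseline method SimCC, Lite-SimCC has 87.1% fewer parameters and improves the Average Precision (AP) by 1% to 71.8% on the COCO2017 test set; experiments on the MPII dataset also verify that Lite-SimCC effectively reduces the number of model parameters while maintaining detection accuracy.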

Key words: human pose estimation, coordinate regression, state space model, lightweight, soft-label
