Journal of Computer Applications (《计算机应用》), 2025, Vol. 45, Issue (10): 3179-3186. DOI: 10.11772/j.issn.1001-9081.2024091351

• Artificial Intelligence •

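Lightweight human pose estimation based on a merged-feature state space model

Zhuoran LI1, Hua LI1, Tong WANG2, Chaozhe JIANG2

  1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756
  2. School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756
  • Received: 2024-09-23 Revised: 2024-11-27 Accepted: 2024-12-02 Online: 2024-12-20 Published: 2025-10-10
  • Corresponding author: Hua LI
  • About the authors: LI Zhuoran (born 2000), male, from Dazhou, Sichuan, M. S. candidate, CCF member. Main research interests: computer vision, human pose estimation.
    LI Hua (born 1979), male, from Chengdu, Sichuan, lecturer, Ph. D. Main research interests: computer vision, deep learning. Email: hli8@swjtu.edu.cn
    WANG Tong (born 1995), male, from Shijiazhuang, Hebei, M. S. candidate. Main research interests: machine learning, image processing.
    JIANG Chaozhe (born 1968), male, from Dazhou, Sichuan, professor, Ph. D. Main research interests: big data, artificial intelligence, advanced manufacturing.
  • Supported by:
    Jade Kirin Science and Innovation Fund (2019H010362)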

Lightweight human pose estimation based on merge state space model

Zhuoran LI1, Hua LI1, Tong WANG2, Chaozhe JIANG2

  1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu Sichuan 611756, China
  2. School of Transportation and Logistics, Southwest Jiaotong University, Chengdu Sichuan 611756, China
  • Received:2024-09-23 Revised:2024-11-27 Accepted:2024-12-02 Online:2024-12-20 Published:2025-10-10
  • Contact: Hua LI
  • About author:LI Zhuoran, born in 2000, M. S. candidate. His research interests include computer vision, human pose estimation.
    LI Hua, born in 1979, Ph. D., lecturer. His research interests include computer vision, deep learning.
    WANG Tong, born in 1995, M. S. candidate. His research interests include machine learning, image processing.
    JIANG Chaozhe, born in 1968, Ph. D., professor. His research interests include big data, artificial intelligence, advanced manufacturing.
  • Supported by:
    Jade Kirin Science and Innovation Fund(2019H010362)

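Abstract:

In the field of Human Pose Estimation (HPE), heatmap-based methods suffer from large quantization error, high computational complexity, and the need for heatmap post-processing. To address these problems, Lite-SimCC, a lightweight HPE method based on a merged-feature State Space Model (MSSM), is proposed, taking the coordinate regression method SimCC as the baseline. First, ShuffleNet V2 is adopted as the backbone network in place of the original HRNet (High-Resolution Net), which simplifies the structure to a single branch and makes the model lightweight. Second, to reduce the loss of precision, large kernel convolution is introduced to extract global feature information. Third, the MSSM is designed to process local and global long-sequence features and strengthen the representation of keypoints. Finally, a soft-label-based loss function is proposed to replace the traditional one-hot loss computation. Experimental results show that, compared with the baseline SimCC, Lite-SimCC has 87.1% fewer parameters and a 1.4% higher Average Precision (AP) on the COCO2017 test set; on the MPII dataset, it is verified that Lite-SimCC effectively reduces the number of model parameters while maintaining detection precision.

Keywords: human pose estimation, coordinate regression, state space model, lightweight, soft label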

Abstract:

In the field of Human Pose Estimation (HPE), heatmap-based methods suffer from large quantization error, high computational complexity, and the need to post-process heatmaps. To address these problems, a lightweight HPE method named Lite-SimCC, based on a Merge State Space Model (MSSM), was proposed with the coordinate regression method SimCC as the baseline. Firstly, ShuffleNet V2 was adopted as the backbone network in place of the original HRNet (High-Resolution Net), which simplified the architecture to a single-branch structure and made the model lightweight. Secondly, to reduce the loss of precision, large kernel convolution was introduced to extract global feature information. Thirdly, an MSSM was designed to process both local and global long-sequence features, so as to enhance the representational ability of the keypoints. Finally, a soft-label-based loss function was proposed to replace the traditional one-hot loss calculation. Experimental results show that, compared with the baseline SimCC, Lite-SimCC reduces the number of parameters by 87.1% and improves the Average Precision (AP) by 1.4% on the COCO2017 test set. Experiments on the MPII dataset further verify that Lite-SimCC effectively reduces the number of model parameters while maintaining detection precision.

Key words: Human Pose Estimation (HPE), coordinate regression, State Space Model (SSM), lightweight, soft-label
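
The abstract contrasts a soft-label loss with the traditional one-hot loss but does not state its form. The following is a minimal sketch only, assuming the per-axis bin classification used by SimCC-style methods and a Gaussian-smoothed label. The smoothing shape, `sigma`, `split_ratio`, and all function names are illustrative assumptions, not the authors' implementation. It shows how a soft label can penalize a near miss less than a far miss, whereas a one-hot label scores both the same.

```python
import math


def soft_label(target_bin, num_bins, sigma=2.0):
    """Gaussian-smoothed 1D label centred on target_bin, normalised to sum to 1.

    Assumption: Gaussian smoothing; the abstract does not specify the shape.
    """
    weights = [math.exp(-((i - target_bin) ** 2) / (2.0 * sigma ** 2))
               for i in range(num_bins)]
    total = sum(weights)
    return [w / total for w in weights]


def one_hot_label(target_bin, num_bins):
    """Traditional one-hot label: all mass on the nearest bin."""
    idx = min(max(round(target_bin), 0), num_bins - 1)
    return [1.0 if i == idx else 0.0 for i in range(num_bins)]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def cross_entropy(logits, label):
    """Cross-entropy between a label distribution and softmax(logits)."""
    return -sum(p * lp for p, lp in zip(label, log_softmax(logits)))


def peaked_logits(peak_bin, num_bins, height=10.0):
    """Toy prediction: one confident bin, zeros elsewhere (illustration only)."""
    return [height if i == peak_bin else 0.0 for i in range(num_bins)]


def decode(logits, split_ratio):
    """Map the highest-scoring bin back to a pixel coordinate; no heatmap step."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return best / split_ratio


if __name__ == "__main__":
    width, split_ratio = 192, 2.0          # hypothetical input width and bin density
    num_bins = int(width * split_ratio)    # 384 bins along the x axis
    target_bin = 100.3 * split_ratio       # ground-truth x = 100.3 px

    soft = soft_label(target_bin, num_bins)
    hard = one_hot_label(target_bin, num_bins)
    near = peaked_logits(202, num_bins)    # about one bin away from the target
    far = peaked_logits(300, num_bins)     # about a hundred bins away

    print("soft  near/far:", cross_entropy(near, soft), cross_entropy(far, soft))
    print("1-hot near/far:", cross_entropy(near, hard), cross_entropy(far, hard))
    print("decoded x (px):", decode(near, split_ratio))
```

In this toy setting, the one-hot loss depends only on the score of the single target bin, so it cannot distinguish the two wrong predictions; the soft label spreads mass over neighbouring bins and therefore rewards the closer one.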

CLC number: