Journal of Computer Applications ›› 2026, Vol. 46 ›› Issue (6): 1973-1980. DOI: 10.11772/j.issn.1001-9081.2025060700

• Multimedia computing and computer simulation •

Lightweight human pose estimation network based on redundant feature suppression

Chao LYU, Geyao MA

  1. College of Electronic Information Engineering, Changchun University of Science and Technology, Changchun, Jilin 130022, China
  • Received: 2025-06-24 Revised: 2025-09-05 Accepted: 2025-09-11 Online: 2025-09-17 Published: 2026-06-10
  • Contact: Chao LYU
  • About author: MA Geyao, born in 2001, M.S. candidate. His research interests include pattern recognition and intelligent systems.
    First author contact: LYU Chao, born in 1989, Ph.D., associate professor. His research interests include pattern recognition and intelligent systems.
  • Supported by:
    National Key Research and Development Program of China (2024YFC2207103); Natural Science Foundation of Jilin Province (20240101345JC)

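Lightweight human pose estimation network based on redundant feature suppression

LYU Chao, MA Geyao

  1. College of Electronic Information Engineering, Changchun University of Science and Technology, Changchun 130022
  • Corresponding author: LYU Chao
  • About the authors: MA Geyao (born 2001), male (Hui ethnicity), from Anshan, Liaoning; M.S. candidate. Main research interests: pattern recognition, intelligent systems.
    First contact: LYU Chao (born 1989), male, from Changchun, Jilin; associate professor, Ph.D. Main research interests: pattern recognition, intelligent systems.
  • Funding:
    National Key Research and Development Program of China (2024YFC2207103); Natural Science Foundation of Jilin Province (20240101345JC)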

Abstract:

To address the difficulty that existing Human Pose Estimation (HPE) networks have in balancing computational efficiency and localization accuracy in complex scenarios, a lightweight HPE network based on redundant feature suppression, named LE-SHNet (Lightweight Enhanced Stacked Hourglass Network), was proposed. Firstly, the Multiple Separated Hourglass Module (MSHM) was designed, which uses heterogeneous convolution branches to model the features of large joints and distal limbs differently while suppressing redundant computation. Then, Shuffle Efficient Channel Attention (SECA) was inserted between MSHMs, combining channel shuffling with adaptive-kernel convolution to strengthen hierarchical joint correlations with zero additional parameters. Finally, the Spatial and Channel Perception Module (SCPM) was built in the non-MSHM parts of the network to strengthen perception of key areas through spatial-channel reconstruction and the Triplet Attention (TA) mechanism. Experimental results show that LE-SHNet achieves an Average Precision (AP) of 88.7% on MPII (Max Planck Institute for Informatics) and 71.3% on COCO2017 (Common Objects in COntext 2017). Compared with the baseline network, the Two Stacked Hourglass Network (2-SHNet), LE-SHNet reduces the number of parameters by 49.3% and the computational cost by 28.2% while increasing AP by 1.0 percentage points. Compared with the lightweight HPE networks EL-HRNet (Efficient and Lightweight High-Resolution Network) and MobileMultiPose (Mobile-friendly and Multi-feature aggregation Pose estimation), LE-SHNet improves AP by 1.0 and 0.8 percentage points, respectively, while reducing the number of parameters by 32.0% and 26.7%, respectively. These results indicate that LE-SHNet stays lightweight while improving keypoint localization accuracy, giving it potential application value for real-time deployment on edge devices in scenarios such as intelligent monitoring, human-computer interaction, and sports rehabilitation.
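
The abstract names the two ingredients of SECA (channel shuffling and adaptive-kernel convolution) but not how they are wired together. The following is only a minimal sketch of those two ideas on plain Python lists, not the authors' implementation. It assumes a ShuffleNet-style shuffle and the kernel-size rule from ECA-Net with gamma = 2 and b = 1. The function names, the shuffle-then-attend ordering, and the uniform kernel weights standing in for learned ones are all hypothetical.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    # ECA-Net rule (assumed here): take (log2(C) + b) / gamma, truncate,
    # and bump even results to the next odd integer.
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def channel_shuffle(channels, groups):
    # ShuffleNet-style shuffle: view the list as (groups, n), transpose, flatten.
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


def channel_attention(descriptors, kernel_weights):
    # 1D convolution across neighbouring channels (zero padded), then sigmoid.
    k = len(kernel_weights)
    pad = k // 2
    padded = [0.0] * pad + list(descriptors) + [0.0] * pad
    weights = []
    for i in range(len(descriptors)):
        s = sum(w * padded[i + j] for j, w in enumerate(kernel_weights))
        weights.append(1.0 / (1.0 + math.exp(-s)))
    return weights


def seca_sketch(descriptors, groups=2):
    # Hypothetical composition: shuffle the per-channel descriptors, then
    # compute attention weights with a kernel sized by the channel count.
    shuffled = channel_shuffle(list(descriptors), groups)
    k = eca_kernel_size(len(shuffled))
    uniform = [1.0 / k] * k  # stand-in for learned kernel weights (assumption)
    return channel_attention(shuffled, uniform)
```

Under these assumptions, `eca_kernel_size(256)` gives a kernel of size 5, and `seca_sketch` returns one weight in (0, 1) per channel, which a real module would multiply back onto the feature map.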

Key words: Human Pose Estimation (HPE), Stacked Hourglass Network (SHNet), spatial-channel reconstruction, Triplet Attention (TA), redundant feature suppression, multi-scale feature fusion

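Abstract:

Existing Human Pose Estimation (HPE) networks struggle to balance computational efficiency and localization accuracy in complex scenarios. To address this, a lightweight HPE network based on redundant feature suppression is proposed, named LE-SHNet (Lightweight Enhanced Stacked Hourglass Network). First, a Multiple Separated Hourglass Module (MSHM) is designed, in which heterogeneous convolution branches model the features of large joints and distal limbs differently and effectively suppress redundant computation. Second, Shuffle Efficient Channel Attention (SECA) is introduced between MSHMs, fusing channel shuffling with adaptive-kernel convolution to strengthen cross-level joint correlations with zero parameters. Finally, a Spatial and Channel Perception Module (SCPM) is built in the non-MSHM parts, using spatial-channel reconstruction and the Triplet Attention (TA) mechanism to enhance perception of key regions. Experimental results show that LE-SHNet reaches an Average Precision (AP) of 88.7% on the MPII (Max Planck Institute for Informatics) dataset and 71.3% on the COCO2017 (Common Objects in COntext 2017) dataset. Relative to the baseline network, the Two Stacked Hourglass Network (2-SHNet), it has 49.3% fewer parameters and 28.2% lower computational cost, with AP 1.0 percentage points higher. Relative to the lightweight HPE networks EL-HRNet (Efficient and Lightweight High-Resolution Network) and MobileMultiPose (Mobile-friendly and Multi-feature aggregation Pose estimation), the AP of LE-SHNet is higher by 1.0 and 0.8 percentage points, respectively, with 32.0% and 26.7% fewer parameters, respectively. LE-SHNet thus improves keypoint localization accuracy while remaining lightweight, has potential application value for real-time deployment on edge devices, and can be widely used in scenarios such as intelligent monitoring, human-computer interaction, and sports rehabilitation.

Key words: Human Pose Estimation, Stacked Hourglass Network, spatial-channel reconstruction, Triplet Attention, redundant feature suppression, multi-scale feature fusion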

CLC Number: