Journal of Computer Applications, 2026, Vol. 46, Issue (6): 1973-1980. DOI: 10.11772/j.issn.1001-9081.2025060700
Received: 2025-06-24
Revised: 2025-09-05
Accepted: 2025-09-11
Online: 2025-09-17
Published: 2026-06-10
Corresponding author: Chao LYU
About author: MA Geyao, born in 2001, male (Hui ethnicity), from Anshan, Liaoning, M. S. candidate. His research interests include pattern recognition and intelligent systems.
Abstract:
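To address the difficulty that existing Human Pose Estimation (HPE) networks have in balancing computational efficiency and localization accuracy in complex scenes, a lightweight HPE network based on redundant feature suppression was proposed, named LE-SHNet (Lightweight Enhanced Stacked Hourglass Network). Firstly, a Multiple Separation Hourglass Module (MSHM) was designed, in which heterogeneous convolution branches model the features of large joints and of distal limbs differently while suppressing redundant computation. Secondly, Shuffle Efficient Channel Attention (SECA) was introduced between MSHMs, combining channel shuffle with adaptive-kernel convolution to strengthen cross-level associations among joints with zero additional parameters. Finally, a Spatial Channel Perception Module (SCPM) was built in the non-MSHM parts of the network, using spatial and channel reconstruction together with the Triplet Attention (TA) mechanism to enhance the perception of key regions. Experimental results show that LE-SHNet reaches 88.7% on the MPII (Max Planck Institute for Informatics) dataset and 71.3% on the COCO2017 (Common Objects in COntext 2017) dataset in Average Precision (AP); the MPII figure is the mean PCKh@0.5 listed in Tab. 2 and Tab. 3. Compared with the baseline network, the 2-Stacked Hourglass Network (2-SHNet), LE-SHNet has 49.3% fewer parameters and 28.2% lower computational cost, with accuracy on MPII improved by 1.0 percentage points. Compared with the lightweight HPE networks EL-HRNet (Efficient and Lightweight High-Resolution Network) and MobileMultiPose (Mobile-friendly and Multi-feature aggregation Pose estimation), LE-SHNet improves accuracy on MPII by 1.0 and 0.8 percentage points, respectively, while reducing the number of parameters by 32.0% and 26.7%, respectively. LE-SHNet therefore improves keypoint localization accuracy while remaining lightweight, which indicates potential for real-time deployment on edge devices in scenarios such as intelligent surveillance, human-computer interaction, and sports rehabilitation.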
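The abstract describes SECA as a combination of channel shuffle and adaptive-kernel convolution but does not give its exact layout. The following is a minimal sketch of those two ingredients as they are commonly defined: the adaptive kernel-size rule follows ECA-Net (with its usual defaults `gamma=2`, `b=1`), and the shuffle is the standard group interleaving. The function names, the ordering of the steps, and the kernel weights are illustrative assumptions, not the authors' implementation.

```python
import math


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """ECA-Net rule: kernel size is the odd value derived from (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def channel_shuffle(channels: list, groups: int) -> list:
    """Interleave channels across groups: view as (groups, n), transpose, flatten."""
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]


def channel_attention(pooled: list, kernel: list) -> list:
    """1D convolution across channel descriptors (zero padded), then a sigmoid gate."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(pooled) + [0.0] * pad
    weights = []
    for i in range(len(pooled)):
        s = sum(w * padded[i + j] for j, w in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-s)))
    return weights


# Hypothetical usage: shuffle globally pooled descriptors, then gate them.
pooled = [0.2, -0.1, 0.4, 0.0, 0.3, -0.5]             # one value per channel (assumed)
k = eca_kernel_size(len(pooled))                      # kernel size for 6 channels
kernel = [1.0 / k] * k                                # placeholder weights (assumed)
gates = channel_attention(channel_shuffle(pooled, groups=2), kernel)
```

Each gate lies in (0, 1) and would multiply the corresponding channel of the feature map. The 1D kernel is small, which is why attention of this kind adds almost nothing to the parameter count.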
Chao LYU, Geyao MA. Lightweight human pose estimation network based on redundant feature suppression[J]. Journal of Computer Applications, 2026, 46(6): 1973-1980.
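| Improved module | Structure in the original network | Time complexity of the original network | Time complexity of the improved network |
|---|---|---|---|
| MSHM | Hourglass module built from standard bottleneck residual modules | | |
| SECA | No attention | 0 | |
| SCPM | Standard bottleneck residual module | | |

Tab. 1 Time complexity comparison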
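As a generic illustration of how time-complexity comparisons of this kind are usually formed (these are not the expressions from Tab. 1), the sketch below counts multiply-accumulate operations for a standard convolution and for a depthwise separable convolution, one common way for lightweight blocks to lower cost. The function names and the example feature-map size are assumptions chosen for illustration.

```python
def conv_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """Multiply-accumulates of a standard k x k convolution on an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 64 x 64 feature map, 128 -> 128 channels, 3 x 3 kernel.
standard = conv_macs(64, 64, 128, 128, 3)
separable = depthwise_separable_macs(64, 64, 128, 128, 3)
ratio = separable / standard  # equals 1 / c_out + 1 / k**2
```

For this layer the separable variant needs roughly 12% of the operations of the standard convolution, which shows why replacing standard bottleneck modules can cut cost substantially.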
| MSHM | SECA | SCPM | Params/10^6 | FLOPs/10^9 | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| × | × | × | 6.7 | 2.52 | 96.2 | 94.6 | 87.8 | 81.5 | 87.9 | 82.8 | 78.0 | 87.7 |
| √ | × | × | 4.2 | 1.94 | 96.4 | 94.8 | 88.1 | 81.6 | 88.2 | 82.9 | 78.1 | 87.8 |
| × | √ | × | 6.7 | 2.63 | 96.5 | 95.1 | 88.6 | 83.0 | 87.9 | 83.4 | 79.1 | 88.2 |
| × | × | √ | 5.9 | 2.31 | 96.6 | 94.8 | 88.2 | 81.7 | 88.4 | 82.9 | 78.2 | 87.9 |
| √ | × | √ | 3.4 | 1.77 | 96.4 | 94.6 | 87.9 | 81.6 | 88.1 | 82.8 | 78.0 | 88.1 |
| × | √ | √ | 5.3 | 2.17 | 96.7 | 95.2 | 88.6 | 82.9 | 88.4 | 83.6 | 79.3 | 88.4 |
| √ | √ | × | 4.2 | 2.01 | 96.5 | 95.2 | 88.6 | 82.8 | 88.2 | 83.5 | 79.2 | 88.3 |
| √ | √ | √ | 3.4 | 1.81 | 96.8 | 95.3 | 88.8 | 83.2 | 88.5 | 84.1 | 80.2 | 88.7 |

Tab. 2 Ablation study results (keypoint columns report PCKh@0.5/% for each predicted keypoint)
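The relative savings quoted in the abstract can be reproduced from the first and last rows of the ablation table. The sketch below is a small arithmetic check; the dictionary keys and the helper name are illustrative choices, while the numbers are taken from the table.

```python
def relative_reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0


# Baseline 2-SHNet (no module enabled) and full LE-SHNet (all three modules enabled).
baseline = {"params_m": 6.7, "gflops": 2.52, "mean_pckh": 87.7}
full = {"params_m": 3.4, "gflops": 1.81, "mean_pckh": 88.7}

param_cut = round(relative_reduction(baseline["params_m"], full["params_m"]), 1)
flops_cut = round(relative_reduction(baseline["gflops"], full["gflops"]), 1)
pckh_gain = round(full["mean_pckh"] - baseline["mean_pckh"], 1)

print(param_cut, flops_cut, pckh_gain)
```

The three values correspond to the parameter reduction, the FLOPs reduction, and the accuracy gain in percentage points reported for LE-SHNet against the baseline.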
| Type | Network | Params/10^6 | FLOPs/10^9 | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Large-scale | 2-SHNet | 6.70 | 2.52 | 96.2 | 94.6 | 87.8 | 81.5 | 87.9 | 82.8 | 78.0 | 87.7 |
| Large-scale | FLPN | 22.50 | 2.80 | 96.2 | 95.2 | 88.6 | 82.7 | 88.4 | 83.6 | 80.0 | 88.4 |
| Large-scale | HRNet-MSSA | 28.50 | 10.30 | — | — | — | — | — | — | — | 91.5 |
| Large-scale | MamKPD-B | 7.10 | 3.10 | — | — | — | — | — | — | — | 90.7 |
| Lightweight | Lightweight | 3.10 | 0.77 | 95.6 | 93.9 | 85.1 | 79.5 | 86.3 | 80.4 | 75.5 | 85.9 |
| Lightweight | EL-HRNet-32 | 5.00 | 2.66 | 96.7 | 94.8 | 87.6 | 82.2 | 88.2 | 82.4 | 77.9 | 87.7 |
| Lightweight | WideHRNet-18 | 2.70 | 0.96 | — | — | — | — | — | — | — | 87.7 |
| Lightweight | LMFormer-L | 4.10 | 1.90 | — | — | — | — | — | — | — | 87.6 |
| Lightweight | HRNet-MSSA-Lite | 1.10 | 0.70 | — | — | — | — | — | — | — | 83.7 |
| Lightweight | MobileMultiPose-L | 4.64 | 1.61 | — | — | — | — | — | — | — | 87.9 |
| Lightweight | LE-SHNet | 3.40 | 1.81 | 96.8 | 95.3 | 88.8 | 83.2 | 88.5 | 84.1 | 80.2 | 88.7 |

Tab. 3 Comparison experimental results on MPII validation set (keypoint columns report PCKh@0.5/% for each predicted keypoint)
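The MPII results are reported as PCKh@0.5: a predicted keypoint counts as correct when its distance to the ground truth is at most 0.5 times the head segment length of that person. The sketch below shows this rule for a single person; the function name, the argument layout, and the example coordinates are assumptions for illustration, and a full evaluation would also handle invisible joints and average over the dataset.

```python
import math


def pckh(pred, gt, head_size: float, alpha: float = 0.5) -> float:
    """Fraction of keypoints whose error is within alpha * head_size."""
    hits = 0
    for (px, py), (gx, gy) in zip(pred, gt):
        if math.hypot(px - gx, py - gy) <= alpha * head_size:
            hits += 1
    return hits / len(gt)


# Hypothetical person with three keypoints and a head segment length of 10 pixels.
pred = [(0.0, 0.0), (10.0, 0.0), (0.0, 20.0)]
gt = [(3.0, 0.0), (10.0, 0.0), (0.0, 0.0)]
score = pckh(pred, gt, head_size=10.0)  # errors of 3, 0 and 20 pixels against a 5-pixel threshold
```

Normalizing by head size makes the threshold scale with the person, so the same score is comparable across near and distant subjects.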
| Dataset | Type | Network | Input size | Params/10^6 | FLOPs/10^9 | AP/% | AP50/% | AP75/% | AR/% |
|---|---|---|---|---|---|---|---|---|---|
| COCO2017 validation set | Large-scale | 2-SHNet | 256×192 | 6.70 | 2.10 | 65.6 | 87.3 | 73.8 | 72.0 |
| COCO2017 validation set | Large-scale | HRPVT-L | 256×192 | 25.10 | 5.40 | 75.2 | 90.6 | 82.4 | 80.4 |
| COCO2017 validation set | Large-scale | MSPose-L | 256×192 | 27.50 | 11.00 | 76.0 | 90.5 | 82.7 | 81.2 |
| COCO2017 validation set | Large-scale | MamKPD-L | 256×192 | 12.40 | 4.30 | 77.3 | 90.8 | 83.4 | 82.1 |
| COCO2017 validation set | Lightweight | Lightweight | 256×192 | 3.10 | 0.58 | 65.8 | 87.7 | 74.1 | 72.1 |
| COCO2017 validation set | Lightweight | EL-HRNet-32 | 256×192 | 5.00 | 2.00 | 67.1 | 86.4 | 74.2 | 74.9 |
| COCO2017 validation set | Lightweight | HRPVT-S | 256×192 | 4.80 | 1.10 | 69.7 | 88.4 | 77.6 | 75.1 |
| COCO2017 validation set | Lightweight | LMFormer-L | 256×192 | 4.10 | 1.40 | 68.9 | 88.3 | 76.4 | 74.7 |
| COCO2017 validation set | Lightweight | MamKPD-S | 256×192 | 6.30 | 0.50 | 75.2 | 90.4 | 82.2 | 75.3 |
| COCO2017 validation set | Lightweight | MSPose-T | 256×192 | 5.80 | 1.30 | 67.1 | 87.3 | 75.3 | 73.4 |
| COCO2017 validation set | Lightweight | MobileMultiPose-L | 256×192 | 4.64 | 1.17 | 70.4 | 89.0 | 78.5 | 76.3 |
| COCO2017 validation set | Lightweight | LE-SHNet | 256×192 | 3.40 | 1.42 | 71.3 | 89.0 | 78.2 | 77.1 |
| COCO2017 test-dev set | Large-scale | 2-SHNet | 256×192 | 6.70 | 2.10 | 65.1 | 89.5 | 73.2 | 71.0 |
| COCO2017 test-dev set | Large-scale | SimpleBaseline | 256×192 | 34.00 | 8.90 | 70.0 | 90.9 | 77.9 | 75.6 |
| COCO2017 test-dev set | Large-scale | MobileNetV2 | 256×192 | 9.60 | 1.48 | 64.1 | 89.4 | 71.8 | 70.1 |
| COCO2017 test-dev set | Large-scale | ShuffleNet V2 | 256×192 | 7.60 | 1.30 | 59.5 | 87.4 | 66.0 | 66.0 |
| COCO2017 test-dev set | Lightweight | Lite-HRNet | 256×192 | 1.10 | 0.20 | 63.7 | 88.6 | 71.1 | 69.7 |
| COCO2017 test-dev set | Lightweight | Lightweight | 256×192 | 3.10 | 0.58 | 65.3 | 89.7 | 73.4 | 71.3 |
| COCO2017 test-dev set | Lightweight | EL-HRNet | 256×192 | 5.00 | 2.00 | 67.7 | 89.7 | 75.5 | 74.4 |
| COCO2017 test-dev set | Lightweight | LE-SHNet | 256×192 | 3.40 | 1.42 | 70.7 | 90.8 | 78.5 | 76.5 |

Tab. 4 Comparison experimental results on COCO2017 validation and test-dev sets
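The COCO metrics AP, AP50 and AP75 are built on Object Keypoint Similarity (OKS), which plays the role that IoU plays in detection: AP50 and AP75 threshold OKS at 0.50 and 0.75, and AP averages over thresholds from 0.50 to 0.95. The sketch below computes OKS for one person following the COCO definition, with the per-keypoint constant taken as twice the keypoint's sigma. The function name and the example values (including the sigma) are illustrative assumptions; the official evaluation uses the `pycocotools` implementation with dataset-provided constants.

```python
import math


def oks(pred, gt, visible, area: float, sigmas) -> float:
    """Mean of exp(-d^2 / (2 * s^2 * k^2)) over labelled keypoints, with s^2 = area."""
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v > 0:  # only keypoints labelled in the ground truth contribute
            d2 = (px - gx) ** 2 + (py - gy) ** 2
            k = 2.0 * sigma
            total += math.exp(-d2 / (2.0 * area * k * k))
            count += 1
    return total / count if count else 0.0


# Hypothetical single keypoint, 2 pixels off, on a person covering 100 square pixels.
value = oks([(2.0, 0.0)], [(0.0, 0.0)], visible=[2], area=100.0, sigmas=[0.1])
```

Because the error is divided by the object area, a 2-pixel miss is penalized more on a small person than on a large one.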
| Network | Input size | AP/% | Inference time on edge device/ms | Inference time on CPU device/ms |
|---|---|---|---|---|
| 2-SHNet | 256×192 | 65.6 | 24.26 | 15.08 |
| RSN-18 | 256×192 | 70.4 | 21.24 | 11.99 |
| SimCC | 256×192 | 68.6 | 22.75 | 26.69 |
| RTMPose-S | 256×192 | 68.5 | 16.65 | 8.63 |
| EdgeNet-S | 256×192 | 69.5 | 19.26 | 12.63 |
| LE-SHNet | 256×192 | 71.3 | 15.76 | 6.87 |

Tab. 5 Comparison experimental results of inference speed
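To relate the inference times to real-time requirements, they can be converted into frames per second and into speedups over the baseline. The sketch below does this for the values in the inference-speed table; it assumes each time is the latency of a single 256×192 input, which the table does not state explicitly, and the variable names are illustrative.

```python
# (edge device ms, CPU device ms) per network, copied from the inference-speed table.
latency_ms = {
    "2-SHNet": (24.26, 15.08),
    "RSN-18": (21.24, 11.99),
    "SimCC": (22.75, 26.69),
    "RTMPose-S": (16.65, 8.63),
    "EdgeNet-S": (19.26, 12.63),
    "LE-SHNet": (15.76, 6.87),
}


def fps(ms: float) -> float:
    """Frames per second implied by a per-image latency in milliseconds."""
    return 1000.0 / ms


edge_fps = {name: fps(edge) for name, (edge, _) in latency_ms.items()}
fastest_on_edge = min(latency_ms, key=lambda name: latency_ms[name][0])
edge_speedup = latency_ms["2-SHNet"][0] / latency_ms["LE-SHNet"][0]
cpu_speedup = latency_ms["2-SHNet"][1] / latency_ms["LE-SHNet"][1]
```

Under that assumption, LE-SHNet runs at roughly 63 frames per second on the edge device, about 1.5 times faster than 2-SHNet there and about 2.2 times faster on the CPU device.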
[1] CHEN J Y, GUO S J, CHEN L L. Lightweight human pose estimation based on decoupled attention and ghost convolution[J]. Journal of Computer Applications, 2025, 45(1): 223-233. (in Chinese)
[2] NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9912. Cham: Springer, 2016: 483-499.
[3] KIM S T, LEE H J. Lightweight stacked hourglass network for human pose estimation[J]. Applied Sciences, 2020, 10(18): 6497.
[4] CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1800-1807.
[5] ZHANG Q, JIANG Z, LU Q, et al. Split to be slim: an overlooked redundancy in vanilla convolution[C]// Proceedings of the 29th International Joint Conference on Artificial Intelligence. California: ijcai.org, 2020: 3195-3201.
[6] LI J, WEN Y, HE L. SCConv: spatial and channel reconstruction convolution for feature redundancy[C]// Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2023: 6153-6162.
[7] MISRA D, NALAMADA T, ARASANIPALAI A U, et al. Rotate to attend: convolutional triplet attention module[C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE, 2021: 3138-3147.
[8] WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11531-11539.
[9] ZHANG Q L, YANG Y B. SA-Net: shuffle attention for deep convolutional neural networks[C]// Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing. Piscataway: IEEE, 2021: 2235-2239.
[10] ANDRILUKA M, PISHCHULIN L, GEHLER P, et al. 2D human pose estimation: new benchmark and state of the art analysis[C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 3686-3693.
[11] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755.
[12] ANDRILUKA M, ROTH S, SCHIELE B. Monocular 3D pose estimation and tracking by detection[C]// Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2010: 623-630.
[13] FISCHLER M A, ELSCHLAGER R A. The representation and matching of pictorial structures[J]. IEEE Transactions on Computers, 1973, C-22(1): 67-92.
[14] FELZENSZWALB P F, HUTTENLOCHER D P. Pictorial structures for object recognition[J]. International Journal of Computer Vision, 2005, 61(1): 55-79.
[15] ESMAIL M A, WANG J, WANG Y, et al. Resource-aware strategies for real-time multi-person pose estimation[J]. Image and Vision Computing, 2025, 155: No.105441.
[16] LI B, TANG S, LI W. Mobile-friendly and multi-feature aggregation via Transformer for human pose estimation[J]. Image and Vision Computing, 2025, 153: No.105343.
[17] DAI Q, LING Q. Hybrid representation learning for end-to-end multi-person pose estimation[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2025, 35(7): 6437-6451.
[18] LV C, MA G. PoseNet++: a multi-scale and optimized feature extraction network for high-precision human pose estimation[J]. PLoS ONE, 2025, 20(6): No.e0326232.
[19] HUA G, LI L, LIU S. Multipath affinage stacked-hourglass networks for human pose estimation[J]. Frontiers of Computer Science, 2020, 14(4): No.144701.
[20] XIAO Y, YU D, WANG X, et al. SPCNet: spatial preserve and content-aware network for human pose estimation[C]// Proceedings of the 24th European Conference on Artificial Intelligence. Amsterdam: IOS Press, 2020: 2776-2783.
[21] BAO W, YANG Y, LIANG D, et al. Multi-residual module stacked hourglass networks for human pose estimation[J]. Journal of Beijing Institute of Technology, 2020, 29(1): 110-119.
[22] ZOU X, BI X, YU C. Improving human pose estimation based on stacked hourglass network[J]. Neural Processing Letters, 2023, 55(7): 9521-9544.
[23] REN H, WANG W, ZHANG K, et al. Fast and lightweight human pose estimation[J]. IEEE Access, 2021, 9: 49576-49589.
[24] ZHANG T, LI Q, WEN J, et al. Enhancement and optimisation of human pose estimation with multi-scale spatial attention and adversarial data augmentation[J]. Information Fusion, 2024, 111: No.102522.
[25] DANG Y, LIU L, KANG H, et al. MamKPD: a simple mamba baseline for real-time 2D keypoint detection[EB/OL]. [2025-06-23].
[26] LI S, XIANG X. Lightweight human pose estimation using heatmap-weighting loss[EB/OL]. [2025-06-23].
[27] LI R, YAN A, YANG S, et al. Human pose estimation based on Efficient and Lightweight High-Resolution Network (EL-HRNet)[J]. Sensors, 2024, 24(2): No.396.
[28] SAMKARI E, ARIF M, ALGHAMDI M, et al. WideHRNet: an efficient model for human pose estimation using wide channels in lightweight high-resolution network[J]. IEEE Access, 2024, 12: 148990-149000.
[29] LI B, TANG S, LI W. LMFormer: lightweight and multi-feature perspective via Transformer for human pose estimation[J]. Neurocomputing, 2024, 594: No.127884.
[30] XU Z, DAI M, ZHANG Q, et al. HRPVT: high-resolution pyramid vision Transformer for medium and small-scale human pose estimation[J]. Neurocomputing, 2025, 619: No.129154.
[31] YUAN X, CHENG P, HAN S. Multi-supervision Transformer combining bounding box and mask for data-limited pose estimation[J]. Neurocomputing, 2024, 571: No.127209.
[32] XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11210. Cham: Springer, 2018: 472-487.
[33] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520.
[34] MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design[C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11218. Cham: Springer, 2018: 122-138.
[35] YU C, XIAO B, GAO C, et al. Lite-HRNet: a lightweight high-resolution network[C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10435-10445.
[36] CAI Y, WANG Z, LUO Z, et al. Learning delicate local representations for multi-person pose estimation[C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12348. Cham: Springer, 2020: 455-472.
[37] LI Y, YANG S, LIU P, et al. SimCC: a simple coordinate classification perspective for human pose estimation[C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13666. Cham: Springer, 2022: 89-106.
[38] JIANG T, LU P, ZHANG L, et al. RTMPose: real-time multi-person pose estimation based on MMPose[EB/OL]. [2025-06-23].
[39] ZHANG L, HUANG W, ZHENG J, et al. EdgePose: real-time human pose estimation scheme for industrial scenes[J]. IEEE Access, 2024, 12: 156702-156716.