Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (1): 223-233. DOI: 10.11772/j.issn.1001-9081.2024010099
• Multimedia computing and computer simulation •
Junying CHEN1, Shijie GUO1,2,3, Lingling CHEN4
Received: 2024-01-26
Revised: 2024-03-25
Accepted: 2024-03-25
Online: 2024-05-09
Published: 2025-01-10
Contact: Shijie GUO
About author: CHEN Junying, born in 2000, male, native of Changde, Hunan, is an M.S. candidate; his research interests include computer vision and human pose estimation.
Junying CHEN, Shijie GUO, Lingling CHEN. Lightweight human pose estimation based on decoupled attention and ghost convolution[J]. Journal of Computer Applications, 2025, 45(1): 223-233.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024010099
| Layer | Output size | Operator | Resolution branches | Output channels | Repeats | Modules (DGLNet-18) | Modules (DGLNet-30) |
|---|---|---|---|---|---|---|---|
| Image | 256×256 | | 1× | 3 | | | |
| stem | 64×64 | conv2d | 2× | 32 | 1 | 1 | 1 |
| | | DFDbottleneck | 4× | 32 | 1 | | |
| stage2 | 64×64 | DGBblock | 4×, 8× | 40, 80 | 2 | 2 | 3 |
| | | GSCtransition | 4×, 8× | 40, 80 | 1 | | |
| stage3 | 64×64 | DGBblock | 4×, 8×, 16× | 40, 80, 160 | 2 | 4 | 8 |
| | | GSCtransition | 4×, 8×, 16× | 40, 80, 160 | 1 | | |
| stage4 | 64×64 | DGBblock | 4×, 8×, 16×, 32× | 40, 80, 160, 320 | 2 | 2 | 3 |
| | | GSCtransition | 4×, 8×, 16×, 32× | 40, 80, 160, 320 | 1 | | |
Tab. 1 Information of each layer module in DGLNet
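For readers who prefer code, the stage layout in Tab. 1 can be summarized as a configuration dictionary. The sketch below only restates the table; the dictionary layout and key names are hypothetical and are not taken from the authors' implementation.

```python
# Hypothetical summary of Tab. 1; key names and structure are illustrative only.
DGLNET_STAGES = {
    # stage: (operators, resolution branches, output channels, repeats)
    "stem":   ("conv2d + DFDbottleneck",   (2, 4),           (32,),               1),
    "stage2": ("DGBblock + GSCtransition", (4, 8),           (40, 80),            2),
    "stage3": ("DGBblock + GSCtransition", (4, 8, 16),       (40, 80, 160),       2),
    "stage4": ("DGBblock + GSCtransition", (4, 8, 16, 32),   (40, 80, 160, 320),  2),
}

# The number of modules per stage is what distinguishes the two model sizes (Tab. 1).
NUM_MODULES = {
    "DGLNet-18": {"stage2": 2, "stage3": 4, "stage4": 2},
    "DGLNet-30": {"stage2": 3, "stage3": 8, "stage4": 3},
}
```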
| Type | Model | Backbone | Input size | Params/10⁶ | GFLOPs | AP/% | AP50/% | AP75/% | APM/% | APL/% | AR/% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Large networks | Hourglass | Hourglass | 256×192 | 25.1 | 14.3 | 66.9 | — | — | — | — | — |
| | CPN | ResNet-50 | 256×192 | 27.0 | 6.2 | 68.6 | — | — | — | — | — |
| | SimpleBaseline | ResNet-50 | 256×192 | 34.0 | 8.9 | 70.4 | 88.6 | 78.3 | 67.1 | 77.2 | 76.3 |
| | HRNetV1 | HRNetV1-W32 | 256×192 | 28.5 | 7.1 | 73.4 | 89.5 | 80.7 | 70.2 | 80.1 | 78.9 |
| | DARK | HRNetV1-W48 | 128×96 | 63.6 | 3.6 | 71.9 | 89.1 | 79.6 | 69.2 | 78.0 | 77.9 |
| Small networks | MobileNetV2 | MobileNetV2 | 256×192 | 9.6 | 1.4 | 64.6 | 87.4 | 72.3 | 61.1 | 71.2 | 70.7 |
| | MobileNetV2 | MobileNetV2 | 384×288 | 9.6 | 3.3 | 67.3 | 87.9 | 74.3 | 62.8 | 74.7 | 72.9 |
| | ShuffleNetV2 | ShuffleNetV2 | 256×192 | 7.6 | 1.2 | 59.9 | 85.4 | 66.3 | 56.6 | 66.2 | 66.4 |
| | ShuffleNetV2 | ShuffleNetV2 | 384×288 | 7.6 | 2.8 | 63.6 | 86.5 | 70.5 | 59.5 | 70.7 | 69.7 |
| | Small HRNet | HRNet-W16 | 256×192 | 1.3 | 0.5 | 55.2 | 83.7 | 62.4 | 52.3 | 61.0 | 62.1 |
| | Small HRNet | HRNet-W16 | 384×288 | 1.3 | 1.2 | 56.0 | 83.8 | 63.0 | 52.4 | 62.6 | 62.6 |
| | DY-MobileNetV2 | DY-MobileNetV2 | 256×192 | 16.1 | 1.0 | 68.2 | 88.4 | 76.0 | 65.0 | 74.7 | 74.2 |
| | DY-ReLU | MobileNetV2 | 256×192 | 9.0 | 1.0 | 68.1 | 88.5 | 76.2 | 64.8 | 74.3 | — |
| | Lite-HRNet | Lite-HRNet-18 | 256×192 | 1.1 | 0.2 | 64.8 | 86.7 | 73.0 | 62.1 | 70.5 | 71.2 |
| | Lite-HRNet | Lite-HRNet-18 | 384×288 | 1.1 | 0.4 | 67.6 | 87.8 | 75.0 | 64.5 | 73.7 | 73.7 |
| | Lite-HRNet | Lite-HRNet-30 | 256×192 | 1.8 | 0.3 | 67.2 | 88.0 | 75.0 | 64.3 | 73.1 | 73.3 |
| | Lite-HRNet | Lite-HRNet-30 | 384×288 | 1.8 | 0.7 | 70.4 | 88.7 | 77.7 | 67.5 | 76.3 | 76.2 |
| DGLNet | DGLNet | DGLNet-18 | 256×192 | 1.1 | 0.2 | 66.1 | 89.4 | 73.2 | 64.0 | 71.9 | 71.8 |
| | DGLNet | DGLNet-18 | 384×288 | 1.1 | 0.4 | 68.5 | 89.5 | 76.0 | 65.9 | 73.9 | 74.1 |
| | DGLNet | DGLNet-30 | 256×192 | 1.8 | 0.3 | 68.4 | 89.7 | 76.1 | 65.9 | 74.2 | 73.8 |
| | DGLNet | DGLNet-30 | 384×288 | 1.8 | 0.7 | 71.9 | 89.9 | 78.2 | 68.8 | 77.3 | 76.9 |
Tab. 2 Performance comparison on COCO validation set
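As context for the metrics in Tab. 2 and Tab. 3: COCO keypoint accuracy is based on Object Keypoint Similarity (OKS). AP is the mean average precision over the OKS thresholds 0.50:0.05:0.95, AP50 and AP75 are AP at the single thresholds 0.50 and 0.75, APM and APL restrict the evaluation to medium and large persons, and AR is the corresponding average recall. The standard COCO definition of OKS (not restated in this excerpt) is

```latex
\mathrm{OKS}=\frac{\sum_{i}\exp\!\left(-d_i^{2}\,/\,(2 s^{2} k_i^{2})\right)\,\delta(v_i>0)}{\sum_{i}\delta(v_i>0)}
```

where d_i is the distance between the i-th predicted and ground-truth keypoints, s is the object scale, k_i is a per-keypoint falloff constant, and v_i is the visibility flag.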
| Type | Model | Backbone | Input size | Params/10⁶ | GFLOPs | AP/% | AP50/% | AP75/% | APM/% | APL/% | AR/% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Large networks | SimpleBaseline | ResNet-50 | 256×192 | 34.0 | 8.9 | 70.0 | 90.9 | 77.9 | 66.8 | 75.8 | 75.6 |
| | CPN | ResNet-Inception | 384×288 | — | — | 72.1 | 91.4 | 80.0 | 68.7 | 77.2 | 78.5 |
| | HRNetV1 | HRNetV1-W32 | 384×288 | 28.5 | 16.0 | 74.9 | 92.5 | 82.8 | 71.3 | 80.9 | 80.1 |
| | DARK | HRNetV1-W48 | 384×288 | 63.6 | 32.9 | 76.2 | 92.5 | 83.6 | 72.5 | 82.4 | 81.1 |
| Small networks | MobileNetV2 | MobileNetV2 | 384×288 | 9.8 | 3.3 | 66.8 | 90.0 | 74.0 | 62.6 | 73.3 | 72.3 |
| | ShuffleNetV2 | ShuffleNetV2 | 384×288 | 7.6 | 2.8 | 62.9 | 88.5 | 69.4 | 58.9 | 69.3 | 68.9 |
| | Small HRNet | HRNet-W16 | 384×288 | 1.3 | 1.2 | 55.2 | 85.8 | 61.4 | 51.7 | 61.2 | 61.5 |
| | Lite-HRNet | Lite-HRNet-18 | 384×288 | 1.1 | 0.4 | 66.9 | 89.4 | 74.4 | 64.0 | 72.2 | 72.6 |
| | Lite-HRNet | Lite-HRNet-30 | 384×288 | 1.8 | 0.7 | 69.7 | 90.7 | 77.5 | 66.9 | 75.0 | 75.4 |
| DGLNet | DGLNet | DGLNet-18 | 384×288 | 1.1 | 0.4 | 68.6 | 90.1 | 75.7 | 65.3 | 74.0 | 74.4 |
| | DGLNet | DGLNet-30 | 384×288 | 1.8 | 0.7 | 71.0 | 90.9 | 77.9 | 67.3 | 76.5 | 76.7 |
Tab. 3 Performance comparison on COCO test set
| Model | Params/10⁶ | GFLOPs | PCKh@0.5/% |
|---|---|---|---|
| MobileNetV2 | 9.6 | 1.9 | 85.4 |
| MobileNetV3 | 8.7 | 1.8 | 84.3 |
| ShuffleNetV2 | 7.6 | 1.7 | 82.8 |
| Small HRNet-W16 | 1.3 | 0.7 | 80.2 |
| Lite-HRNet-18 | 1.1 | 0.2 | 86.1 |
| Lite-HRNet-30 | 1.8 | 0.4 | 87.0 |
| DGLNet-18 (ours) | 1.1 | 0.2 | 86.8 |
| DGLNet-30 (ours) | 1.8 | 0.4 | 87.7 |
Tab. 4 Performance comparison on MPII validation set (PCKh@0.5)
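Tab. 4 reports PCKh@0.5 on MPII [41]: a predicted joint counts as correct when its distance to the ground truth is at most 0.5 times the head-segment length. Below is a minimal sketch of that computation; it ignores the per-joint visibility masking of the official MPII toolkit, and the function and argument names are illustrative.

```python
import numpy as np

def pckh(pred, gt, head_len, alpha=0.5):
    """Fraction of joints within alpha * head-segment length of the ground truth.

    pred, gt: (N, K, 2) arrays of joint coordinates; head_len: (N,) head sizes.
    Simplified sketch: the official MPII protocol additionally masks invisible joints.
    """
    dist = np.linalg.norm(pred - gt, axis=-1)            # (N, K) Euclidean distances
    return float((dist <= alpha * head_len[:, None]).mean())
```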
| Model | Params/10⁶ | GFLOPs | AP/% |
|---|---|---|---|
| Small HRNet | 1.30 | 0.50 | 55.2 |
| Small HRNet + DFDbottleneck | 1.34 | 0.51 | 59.8 |
| Small HRNet + DGBblock | 1.13 | 0.34 | 61.7 |
| Small HRNet + GSCtransition | 1.21 | 0.39 | 60.2 |
| DGLNet-18 | 1.10 | 0.21 | 66.3 |
Tab. 5 Ablation results of the lightweight network modules
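The lightweight modules ablated in Tab. 5 build on ghost convolution. As a reference point, the sketch below shows a ghost convolution in the spirit of GhostNet [17], where a small primary convolution is complemented by cheap depthwise operations; it is illustrative only and is not the paper's exact DFDbottleneck, DGBblock, or GSCtransition implementation.

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution sketch in the spirit of GhostNet [17]
    (illustrative; not the exact module used in DGLNet)."""
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        prim_ch = out_ch // ratio                        # intrinsic feature maps
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, prim_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(prim_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(                      # depthwise "cheap operation"
            nn.Conv2d(prim_ch, out_ch - prim_ch, dw_kernel,
                      padding=dw_kernel // 2, groups=prim_ch, bias=False),
            nn.BatchNorm2d(out_ch - prim_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y = self.primary(x)                              # intrinsic features
        return torch.cat([y, self.cheap(y)], dim=1)      # intrinsic + ghost features
```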
| Model | Params/10⁶ | GFLOPs | AP/% |
|---|---|---|---|
| Small HRNet | 1.30 | 0.50 | 55.2 |
| Small HRNet + DFDbottleneck (with DFC) | 1.34 | 0.51 | 59.8 |
| Small HRNet + DFDbottleneck (without DFC) | 1.29 | 0.49 | 59.3 |
| Small HRNet + DGBblock (with DFC) | 1.13 | 0.34 | 61.7 |
| Small HRNet + DGBblock (without DFC) | 1.01 | 0.25 | 60.9 |
| Small HRNet + GSCtransition (with DFC) | 1.21 | 0.39 | 60.2 |
| Small HRNet + GSCtransition (without DFC) | 1.12 | 0.35 | 59.5 |
Tab. 6 Ablation results of the decoupled attention module
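The DFC attention ablated in Tab. 6 refers to the decoupled fully connected attention of GhostNetV2 [16], which approximates long-range attention with horizontal and vertical strip (depthwise) convolutions computed on a downsampled feature map. The following is a sketch of that idea only; the layer names, kernel size, and downsampling factor are assumptions and may differ from the authors' configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DFCAttention(nn.Module):
    """Decoupled fully connected attention sketch after GhostNetV2 [16]
    (illustrative; not the exact configuration used in this paper)."""
    def __init__(self, channels, kernel=5):
        super().__init__()
        self.reduce = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.horizontal = nn.Conv2d(channels, channels, (1, kernel),
                                    padding=(0, kernel // 2),
                                    groups=channels, bias=False)
        self.vertical = nn.Conv2d(channels, channels, (kernel, 1),
                                  padding=(kernel // 2, 0),
                                  groups=channels, bias=False)

    def forward(self, x):
        a = F.avg_pool2d(x, kernel_size=2)               # attend on a cheaper, downsampled copy
        a = self.vertical(self.horizontal(self.reduce(a)))
        a = torch.sigmoid(F.interpolate(a, size=x.shape[-2:], mode="nearest"))
        return x * a                                     # gate features with the attention map
```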
[1] ZHENG C, WU W, CHEN C, et al. Deep learning-based human pose estimation: a survey [J]. ACM Computing Surveys, 2023, 56(1): No.11.
[2] XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11210. Cham: Springer, 2018: 472-487.
[3] WEI S E, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4724-4732.
[4] NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9912. Cham: Springer, 2016: 483-499.
[5] CHU X, YANG W, OUYANG W, et al. Multi-context attention for human pose estimation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 5669-5678.
[6] YANG W, LI S, OUYANG W, et al. Learning feature pyramids for human pose estimation [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1290-1299.
[7] SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5686-5696.
[8] HOWARD A G, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications [EB/OL]. [2024-02-08].
[9] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520.
[10] HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3 [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1314-1324.
[11] ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856.
[12] MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11218. Cham: Springer, 2018: 122-138.
[13] TAN M, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks [C]// Proceedings of the 36th International Conference on Machine Learning. New York: JMLR.org, 2019: 6105-6114.
[14] TAN M, LE Q V. EfficientNetV2: smaller models and faster training [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 10096-10106.
[15] CUI C, GAO T, WEI S, et al. PP-LCNet: a lightweight CPU convolutional neural network [EB/OL]. [2023-10-08].
[16] TANG Y, HAN K, GUO J, et al. GhostNetV2: enhance cheap operation with long-range attention [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 9969-9982.
[17] HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586.
[18] WANG J, SUN K, CHENG T, et al. Deep high-resolution representation learning for visual recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3349-3364.
[19] CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7103-7112.
[20] FANG H S, XIE S, TAI Y W, et al. RMPE: regional multi-person pose estimation [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2353-2362.
[21] CAI Y, WANG Z, LUO Z, et al. Learning delicate local representations for multi-person pose estimation [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12348. Cham: Springer, 2020: 455-472.
[22] WANG J, LONG X, GAO Y, et al. Graph-PCNN: two stage human pose estimation with graph pose refinement [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12356. Cham: Springer, 2020: 492-508.
[23] PAPANDREOU G, ZHU T, KANAZAWA N, et al. Towards accurate multi-person pose estimation in the wild [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 3711-3719.
[24] YU C, XIAO B, GAO C, et al. Lite-HRNet: a lightweight high-resolution network [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10435-10445.
[25] PISHCHULIN L, INSAFUTDINOV E, TANG S, et al. DeepCut: joint subset partition and labeling for multi person pose estimation [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4929-4937.
[26] INSAFUTDINOV E, PISHCHULIN L, ANDRES B, et al. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9910. Cham: Springer, 2016: 34-50.
[27] CAO Z, SIMON T, WEI S E, et al. Realtime multi-person 2D pose estimation using part affinity fields [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1302-1310.
[28] KREISS S, BERTONI L, ALAHI A. PifPaf: composite fields for human pose estimation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 11969-11978.
[29] NEWELL A, HUANG Z, DENG J. Associative embedding: end-to-end learning for joint detection and grouping [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 2274-2284.
[30] CHENG B, XIAO B, WANG J, et al. HigherHRNet: scale-aware representation learning for bottom-up human pose estimation [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 5385-5394.
[31] IANDOLA F N, HAN S, MOSKEWICZ M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size [EB/OL]. [2023-10-08].
[32] ZHOU D, HOU Q, CHEN Y, et al. Rethinking bottleneck structure for efficient mobile network design [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12348. Cham: Springer, 2020: 680-697.
[33] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. [2023-11-10].
[34] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803.
[35] LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9992-10002.
[36] MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer [EB/OL]. [2023-09-08].
[37] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755.
[38] ZHANG F, ZHU X, DAI H, et al. Distribution-aware coordinate representation for human pose estimation [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 7091-7100.
[39] CHEN Y, DAI X, LIU M, et al. Dynamic convolution: attention over convolution kernels [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11027-11036.
[40] CHEN Y, DAI X, LIU M, et al. Dynamic ReLU [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12364. Cham: Springer, 2020: 351-367.
[41] ANDRILUKA M, PISHCHULIN L, GEHLER P, et al. 2D human pose estimation: new benchmark and state of the art analysis [C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 3686-3693.