Journal of Computer Applications ›› 2025, Vol. 45 ›› Issue (1): 223-233. DOI: 10.11772/j.issn.1001-9081.2024010099

• Multimedia computing and computer simulation •
Junying CHEN1, Shijie GUO1,2,3, Lingling CHEN4
Received: 2024-01-26; Revised: 2024-03-25; Accepted: 2024-03-25; Online: 2024-05-09; Published: 2025-01-10

Contact: Shijie GUO

About author: CHEN Junying, born in 2000 in Changde, Hunan, M. S. candidate. His research interests include computer vision and human pose estimation.
Junying CHEN, Shijie GUO, Lingling CHEN. Lightweight human pose estimation based on decoupled attention and ghost convolution[J]. Journal of Computer Applications, 2025, 45(1): 223-233.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024010099
| Layer | Output size | Operator | Resolution branches | Output channels | Repetitions | Modules (DGLNet-18) | Modules (DGLNet-30) |
|---|---|---|---|---|---|---|---|
| Image | 256×256 | | 1× | 3 | | | |
| stem | 64×64 | conv2d | 2× | 32 | 1 | 1 | 1 |
| | | DFDbottleneck | 4× | 32 | 1 | | |
| stage2 | 64×64 | DGBblock | 4×, 8× | 40, 80 | 2 | 2 | 3 |
| | | GSCtransition | 4×, 8× | 40, 80 | 1 | | |
| stage3 | 64×64 | DGBblock | 4×, 8×, 16× | 40, 80, 160 | 2 | 4 | 8 |
| | | GSCtransition | 4×, 8×, 16× | 40, 80, 160 | 1 | | |
| stage4 | 64×64 | DGBblock | 4×, 8×, 16×, 32× | 40, 80, 160, 320 | 2 | 2 | 3 |
| | | GSCtransition | 4×, 8×, 16×, 32× | 40, 80, 160, 320 | 1 | | |
Tab. 1 Information of each layer module in DGLNet
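For readers who want to reproduce the backbone, the stage layout in Tab. 1 can be captured as a plain configuration structure. The sketch below is a minimal illustration only; the names `DGLNET_CFG`, `num_branches`, `num_blocks`, `num_channels`, and `num_modules` are hypothetical conveniences that mirror the columns of Tab. 1, not identifiers from the authors' released code.

```python
# Hypothetical stage configuration mirroring Tab. 1.
# "num_blocks" is the repetition count per module, "num_channels" the per-branch
# widths, and "num_modules" the module counts for DGLNet-18 vs. DGLNet-30.
DGLNET_CFG = {
    "stem":   {"out_channels": 32},  # conv2d (2x) followed by DFDbottleneck (4x)
    "stage2": {"num_branches": 2, "num_blocks": 2,
               "num_channels": (40, 80),
               "num_modules": {"DGLNet-18": 2, "DGLNet-30": 3}},
    "stage3": {"num_branches": 3, "num_blocks": 2,
               "num_channels": (40, 80, 160),
               "num_modules": {"DGLNet-18": 4, "DGLNet-30": 8}},
    "stage4": {"num_branches": 4, "num_blocks": 2,
               "num_channels": (40, 80, 160, 320),
               "num_modules": {"DGLNet-18": 2, "DGLNet-30": 3}},
}
```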
| Group | Model | Backbone | Input size | Params/10⁶ | GFLOPs | AP/% | AP50/% | AP75/% | APM/% | APL/% | AR/% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Large networks | Hourglass | Hourglass | 256×192 | 25.1 | 14.3 | 66.9 | — | — | — | — | — |
| | CPN | ResNet-50 | 256×192 | 27.0 | 6.2 | 68.6 | — | — | — | — | — |
| | SimpleBaseline | ResNet-50 | 256×192 | 34.0 | 8.9 | 70.4 | 88.6 | 78.3 | 67.1 | 77.2 | 76.3 |
| | HRNetV1 | HRNetV1-W32 | 256×192 | 28.5 | 7.1 | 73.4 | 89.5 | 80.7 | 70.2 | 80.1 | 78.9 |
| | DARK | HRNetV1-W48 | 128×96 | 63.6 | 3.6 | 71.9 | 89.1 | 79.6 | 69.2 | 78.0 | 77.9 |
| Small networks | MobileNetV2 | MobileNetV2 | 256×192 | 9.6 | 1.4 | 64.6 | 87.4 | 72.3 | 61.1 | 71.2 | 70.7 |
| | MobileNetV2 | MobileNetV2 | 384×288 | 9.6 | 3.3 | 67.3 | 87.9 | 74.3 | 62.8 | 74.7 | 72.9 |
| | ShuffleNetV2 | ShuffleNetV2 | 256×192 | 7.6 | 1.2 | 59.9 | 85.4 | 66.3 | 56.6 | 66.2 | 66.4 |
| | ShuffleNetV2 | ShuffleNetV2 | 384×288 | 7.6 | 2.8 | 63.6 | 86.5 | 70.5 | 59.5 | 70.7 | 69.7 |
| | Small HRNet | HRNet-W16 | 256×192 | 1.3 | 0.5 | 55.2 | 83.7 | 62.4 | 52.3 | 61.0 | 62.1 |
| | Small HRNet | HRNet-W16 | 384×288 | 1.3 | 1.2 | 56.0 | 83.8 | 63.0 | 52.4 | 62.6 | 62.6 |
| | DY-MobileNetV2 | DY-MobileNetV2 | 256×192 | 16.1 | 1.0 | 68.2 | 88.4 | 76.0 | 65.0 | 74.7 | 74.2 |
| | DY-ReLU | MobileNetV2 | 256×192 | 9.0 | 1.0 | 68.1 | 88.5 | 76.2 | 64.8 | 74.3 | — |
| | Lite-HRNet | Lite-HRNet-18 | 256×192 | 1.1 | 0.2 | 64.8 | 86.7 | 73.0 | 62.1 | 70.5 | 71.2 |
| | Lite-HRNet | Lite-HRNet-18 | 384×288 | 1.1 | 0.4 | 67.6 | 87.8 | 75.0 | 64.5 | 73.7 | 73.7 |
| | Lite-HRNet | Lite-HRNet-30 | 256×192 | 1.8 | 0.3 | 67.2 | 88.0 | 75.0 | 64.3 | 73.1 | 73.3 |
| | Lite-HRNet | Lite-HRNet-30 | 384×288 | 1.8 | 0.7 | 70.4 | 88.7 | 77.7 | 67.5 | 76.3 | 76.2 |
| | DGLNet | DGLNet-18 | 256×192 | 1.1 | 0.2 | 66.1 | 89.4 | 73.2 | 64.0 | 71.9 | 71.8 |
| | DGLNet | DGLNet-18 | 384×288 | 1.1 | 0.4 | 68.5 | 89.5 | 76.0 | 65.9 | 73.9 | 74.1 |
| | DGLNet | DGLNet-30 | 256×192 | 1.8 | 0.3 | 68.4 | 89.7 | 76.1 | 65.9 | 74.2 | 73.8 |
| | DGLNet | DGLNet-30 | 384×288 | 1.8 | 0.7 | 71.9 | 89.9 | 78.2 | 68.8 | 77.3 | 76.9 |
Tab. 2 Performance comparison on COCO validation set
| Group | Model | Backbone | Input size | Params/10⁶ | GFLOPs | AP/% | AP50/% | AP75/% | APM/% | APL/% | AR/% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Large networks | SimpleBaseline | ResNet-50 | 256×192 | 34.0 | 8.9 | 70.0 | 90.9 | 77.9 | 66.8 | 75.8 | 75.6 |
| | CPN | ResNet-Inception | 384×288 | — | — | 72.1 | 91.4 | 80.0 | 68.7 | 77.2 | 78.5 |
| | HRNetV1 | HRNetV1-W32 | 384×288 | 28.5 | 16.0 | 74.9 | 92.5 | 82.8 | 71.3 | 80.9 | 80.1 |
| | DARK | HRNetV1-W48 | 384×288 | 63.6 | 32.9 | 76.2 | 92.5 | 83.6 | 72.5 | 82.4 | 81.1 |
| Small networks | MobileNetV2 | MobileNetV2 | 384×288 | 9.8 | 3.3 | 66.8 | 90.0 | 74.0 | 62.6 | 73.3 | 72.3 |
| | ShuffleNetV2 | ShuffleNetV2 | 384×288 | 7.6 | 2.8 | 62.9 | 88.5 | 69.4 | 58.9 | 69.3 | 68.9 |
| | Small HRNet | HRNet-W16 | 384×288 | 1.3 | 1.2 | 55.2 | 85.8 | 61.4 | 51.7 | 61.2 | 61.5 |
| | Lite-HRNet | Lite-HRNet-18 | 384×288 | 1.1 | 0.4 | 66.9 | 89.4 | 74.4 | 64.0 | 72.2 | 72.6 |
| | Lite-HRNet | Lite-HRNet-30 | 384×288 | 1.8 | 0.7 | 69.7 | 90.7 | 77.5 | 66.9 | 75.0 | 75.4 |
| | DGLNet | DGLNet-18 | 384×288 | 1.1 | 0.4 | 68.6 | 90.1 | 75.7 | 65.3 | 74.0 | 74.4 |
| | DGLNet | DGLNet-30 | 384×288 | 1.8 | 0.7 | 71.0 | 90.9 | 77.9 | 67.3 | 76.5 | 76.7 |
Tab. 3 Performance comparison on COCO test set
| Model | Params/10⁶ | GFLOPs | PCKh@0.5/% |
|---|---|---|---|
| MobileNetV2 | 9.6 | 1.9 | 85.4 |
| MobileNetV3 | 8.7 | 1.8 | 84.3 |
| ShuffleNetV2 | 7.6 | 1.7 | 82.8 |
| Small HRNet-W16 | 1.3 | 0.7 | 80.2 |
| Lite-HRNet-18 | 1.1 | 0.2 | 86.1 |
| Lite-HRNet-30 | 1.8 | 0.4 | 87.0 |
| DGLNet-18 (proposed) | 1.1 | 0.2 | 86.8 |
| DGLNet-30 (proposed) | 1.8 | 0.4 | 87.7 |
Tab. 4 Performance comparison on MPII validation set (PCKh@0.5)
| Model | Params/10⁶ | GFLOPs | AP/% |
|---|---|---|---|
| Small HRNet | 1.30 | 0.50 | 55.2 |
| Small HRNet + DFDbottleneck | 1.34 | 0.51 | 59.8 |
| Small HRNet + DGBblock | 1.13 | 0.34 | 61.7 |
| Small HRNet + GSCtransition | 1.21 | 0.39 | 60.2 |
| DGLNet-18 | 1.10 | 0.21 | 66.3 |
Tab. 5 Ablation experimental results of network lightweight module
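The lightweight modules ablated in Tab. 5 are built on ghost convolution (GhostNet [17]), where a small primary convolution produces a few intrinsic feature maps and a cheap depthwise operation generates the remaining "ghost" maps. The following PyTorch sketch shows this idea in its generic form; the `GhostModule` class name, the ratio of 2, and the 3×3 cheap kernel are common defaults assumed here, not values confirmed for DGLNet.

```python
import math
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Generic ghost convolution in the spirit of GhostNet [17]:
    primary 1x1 conv -> intrinsic features; cheap depthwise conv -> ghost features;
    both are concatenated to form the output. Illustrative sketch only."""
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = math.ceil(out_ch / ratio)      # intrinsic feature channels
        cheap_ch = init_ch * (ratio - 1)         # ghost feature channels
        self.out_ch = out_ch
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(init_ch, cheap_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=init_ch, bias=False),   # depthwise "cheap" operation
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y1 = self.primary(x)
        y2 = self.cheap(y1)
        return torch.cat([y1, y2], dim=1)[:, :self.out_ch]
```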
| Model | Params/10⁶ | GFLOPs | AP/% |
|---|---|---|---|
| Small HRNet | 1.30 | 0.50 | 55.2 |
| Small HRNet + DFDbottleneck (with DFC) | 1.34 | 0.51 | 59.8 |
| Small HRNet + DFDbottleneck (without DFC) | 1.29 | 0.49 | 59.3 |
| Small HRNet + DGBblock (with DFC) | 1.13 | 0.34 | 61.7 |
| Small HRNet + DGBblock (without DFC) | 1.01 | 0.25 | 60.9 |
| Small HRNet + GSCtransition (with DFC) | 1.21 | 0.39 | 60.2 |
| Small HRNet + GSCtransition (without DFC) | 1.12 | 0.35 | 59.5 |
Tab. 6 Ablation experimental results of decoupled attention module
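The "with/without DFC" rows in Tab. 6 ablate decoupled fully connected (DFC) attention, introduced in GhostNetV2 [16]: long-range dependencies are aggregated by two decoupled 1-D depthwise convolutions (horizontal and vertical) applied to a downsampled copy of the input, and the resulting map re-weights the main-branch features. The sketch below assumes the GhostNetV2 formulation; the kernel size of 5, the 2× downsampling, and the class name `DFCAttention` are assumptions, not values reported by the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DFCAttention(nn.Module):
    """Decoupled fully connected (DFC) attention sketch after GhostNetV2 [16]:
    1x1 conv + horizontal and vertical 1-D depthwise convs on a downsampled map,
    sigmoid gate, then upsample and multiply with the main-branch features."""
    def __init__(self, in_ch, out_ch, kernel=5):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.Conv2d(out_ch, out_ch, (1, kernel), padding=(0, kernel // 2),
                      groups=out_ch, bias=False),    # horizontal aggregation
            nn.BatchNorm2d(out_ch),
            nn.Conv2d(out_ch, out_ch, (kernel, 1), padding=(kernel // 2, 0),
                      groups=out_ch, bias=False),    # vertical aggregation
            nn.BatchNorm2d(out_ch))

    def forward(self, x, features):
        # x: block input; features: main-branch output to be re-weighted.
        attn = torch.sigmoid(self.gate(F.avg_pool2d(x, 2)))
        attn = F.interpolate(attn, size=features.shape[-2:], mode="nearest")
        return features * attn
```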
| 1 | ZHENG C, WU W, CHEN C, et al. Deep learning-based human pose estimation: a survey [J]. ACM Computing Surveys, 2023, 56(1): No.11. | 
| 2 | XIAO B, WU H, WEI Y. Simple baselines for human pose estimation and tracking [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11210. Cham: Springer, 2018: 472-487. | 
| 3 | WEI S E, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4724-4732. | 
| 4 | NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9912. Cham: Springer, 2016: 483-499. | 
| 5 | CHU X, YANG W, OUYANG W, et al. Multi-context attention for human pose estimation [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 5669-5678. | 
| 6 | YANG W, LI S, OUYANG W, et al. Learning feature pyramids for human pose estimation [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 1290-1299. | 
| 7 | SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5686-5696. | 
| 8 | HOWARD A G, ZHU M, CHEN B, et al. MobileNets: efficient convolutional neural networks for mobile vision applications [EB/OL]. [2024-02-08]. |
| 9 | SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: inverted residuals and linear bottlenecks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4510-4520. | 
| 10 | HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3 [C]// Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2019: 1314-1324. | 
| 11 | ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 6848-6856. | 
| 12 | MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design [C]// Proceedings of the 2018 European Conference on Computer Vision, LNCS 11218. Cham: Springer, 2018: 122-138. | 
| 13 | TAN M, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks [C]// Proceedings of the 36th International Conference on Machine Learning. New York: JMLR.org, 2019: 6105-6114. | 
| 14 | TAN M, LE Q V. EfficientNetV2: smaller models and faster training [C]// Proceedings of the 38th International Conference on Machine Learning. New York: JMLR.org, 2021: 10096-10106. | 
| 15 | CUI C, GAO T, WEI S, et al. PP-LCNet: a lightweight CPU convolutional neural network [EB/OL]. [2023-10-08]. |
| 16 | TANG Y, HAN K, GUO J, et al. GhostNetV2: enhance cheap operation with long-range attention [C]// Proceedings of the 36th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2022: 9969-9982. | 
| 17 | HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586. | 
| 18 | WANG J, SUN K, CHENG T, et al. Deep high-resolution representation learning for visual recognition [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3349-3364. | 
| 19 | CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7103-7112. | 
| 20 | FANG H S, XIE S, TAI Y W, et al. RMPE: regional multi-person pose estimation [C]// Proceedings of the 2017 IEEE International Conference on Computer Vision. Piscataway: IEEE, 2017: 2353-2362. | 
| 21 | CAI Y, WANG Z, LUO Z, et al. Learning delicate local representations for multi-person pose estimation [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12348. Cham: Springer, 2020: 455-472. | 
| 22 | WANG J, LONG X, GAO Y, et al. Graph-PCNN: two stage human pose estimation with graph pose refinement [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12356. Cham: Springer, 2020: 492-508. | 
| 23 | PAPANDREOU G, ZHU T, KANAZAWA N, et al. Towards accurate multi-person pose estimation in the wild [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 3711-3719. | 
| 24 | YU C, XIAO B, GAO C, et al. Lite-HRNet: a lightweight high-resolution network [C]// Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2021: 10435-10445. | 
| 25 | PISHCHULIN L, INSAFUTDINOV E, TANG S, et al. DeepCut: joint subset partition and labeling for multi person pose estimation [C]// Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2016: 4929-4937. | 
| 26 | INSAFUTDINOV E, PISHCHULIN L, ANDRES B, et al. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model [C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9910. Cham: Springer, 2016: 34-50. | 
| 27 | CAO Z, SIMON T, WEI S E, et al. Realtime multi-person 2D pose estimation using part affinity fields [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1302-1310. | 
| 28 | KREISS S, BERTONI L, ALAHI A. PifPaf: composite fields for human pose estimation [C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 11969-11978. | 
| 29 | NEWELL A, HUANG Z, DENG J. Associative embedding: end-to-end learning for joint detection and grouping [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2017: 2274-2284. | 
| 30 | CHENG B, XIAO B, WANG J, et al. HigherHRNet: scale-aware representation learning for bottom-up human pose estimation [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 5385-5394. | 
| 31 | IANDOLA F N, HAN S, MOSKEWICZ M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size [EB/OL]. [2023-10-08]. |
| 32 | ZHOU D, HOU Q, CHEN Y, et al. Rethinking bottleneck structure for efficient mobile network design [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12348. Cham: Springer, 2020: 680-697. | 
| 33 | DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [EB/OL]. [2023-11-10]. |
| 34 | WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks [C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7794-7803. | 
| 35 | LIU Z, LIN Y, CAO Y, et al. Swin Transformer: hierarchical vision transformer using shifted windows [C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE, 2021: 9992-10002. | 
| 36 | MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer [EB/OL]. [2023-09-08]. |
| 37 | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context [C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. | 
| 38 | ZHANG F, ZHU X, DAI H, et al. Distribution-aware coordinate representation for human pose estimation [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 7091-7100. | 
| 39 | CHEN Y, DAI X, LIU M, et al. Dynamic convolution: attention over convolution kernels [C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 11027-11036. | 
| 40 | CHEN Y, DAI X, LIU M, et al. Dynamic ReLU [C]// Proceedings of the 2020 European Conference on Computer Vision, LNCS 12364. Cham: Springer, 2020: 351-367. | 
| 41 | ANDRILUKA M, PISHCHULIN L, GEHLER P, et al. 2D human pose estimation: new benchmark and state of the art analysis [C]// Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2014: 3686-3693. | 