Journal of Computer Applications, 2025, Vol. 45, Issue (12): 4004-4011. DOI: 10.11772/j.issn.1001-9081.2024121819
Multimedia computing and computer simulation
Jiali CUI1, Yongji LIU1, Zihe LI1, Han ZHENG1,2
Received:2024-12-27
Revised:2025-04-09
Accepted:2025-04-14
Online:2025-04-28
Published:2025-12-10
Contact: Han ZHENG
About author: CUI Jiali, born in 1975, a native of Zaozhuang, Shandong, Ph. D., associate research fellow, CCF member. His research interests include image processing and pattern recognition.
Jiali CUI, Yongji LIU, Zihe LI, Han ZHENG. HG-YOLO: lightweight and high-precision enhancement pose detection network[J]. Journal of Computer Applications, 2025, 45(12): 4004-4011.
URL: https://www.joca.cn/EN/10.11772/j.issn.1001-9081.2024121819
| Network | Parameters/10^6 | AP50/% | AP75/% | APM/% | APL/% | AR/% |
|---|---|---|---|---|---|---|
| DirectPose[ | — | 86.4 | 68.2 | 56.7 | 69.8 | — |
| OpenPose[ | — | 84.9 | 67.5 | 57.1 | 68.2 | 66.5 |
| HRNet[ | 28.5 | 86.3 | 70.4 | 59.6 | 73.9 | — |
| KAPAO-s[ | 12.6 | 88.4 | 70.4 | 58.6 | 71.7 | 71.2 |
| YOLOv5-s | 10.8 | 87.8 | 69.5 | 57.6 | 72.6 | 67.6 |
| RTMO-s[ | 9.9 | 88.8 | 73.6 | 61.1 | 75.7 | 70.9 |
| YOLOv8-s | 11.6 | 86.2 | 65.9 | 53.9 | 70.3 | 66.5 |
| HG-YOLO-s | 7.9 | 87.0 | 67.2 | 55.7 | 70.9 | 67.4 |
| KAPAO-m | 35.8 | 90.5 | 76.5 | 64.3 | 76.0 | 76.3 |
| YOLOv5-m | 29.3 | 90.7 | 75.8 | 63.4 | 77.1 | 72.8 |
| RTMO-m | 22.6 | 90.6 | 77.1 | 65.1 | 78.1 | 74.2 |
| YOLOv8-m | 26.4 | 88.8 | 72.3 | 60.5 | 73.6 | 71.8 |
| HG-YOLO-m | 19.6 | 90.5 | 73.7 | 61.7 | 74.7 | 72.4 |
Tab. 1 Comparison of experimental results on COCO2017-Keypoints dataset
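Under the standard COCO keypoint protocol, AP50 and AP75 in Tab. 1 are average precision at object keypoint similarity (OKS) thresholds of 0.50 and 0.75. The sketch below shows how OKS could be computed for a single person instance. It is an illustration, not the authors' evaluation code: the function name `oks` and its arguments are hypothetical, the per-keypoint constants are the ones used by the COCO evaluation for its 17 keypoints, and returning 0.0 when no keypoint is labeled is a simplification of what the official toolkit does.

```python
import math

# Per-keypoint constants (sigmas) used by the COCO keypoint evaluation, 17 keypoints.
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, gt, visible, area, sigmas=COCO_SIGMAS):
    """Object keypoint similarity between one prediction and one ground truth.

    pred, gt: lists of (x, y) pixel coordinates, one per keypoint.
    visible:  list of ints, > 0 where the ground-truth keypoint is labeled.
    area:     ground-truth object area in squared pixels.
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        count += 1
    # Simplification (assumption): report 0.0 when nothing is labeled.
    return total / count if count else 0.0


if __name__ == "__main__":
    gt = [(10.0 * i, 5.0 * i) for i in range(17)]
    pred = [(x + 3.0, y) for x, y in gt]  # every keypoint off by 3 px
    print(oks(pred, gt, [2] * 17, area=1000.0))
```

A prediction then counts as a true positive for AP50 when its OKS with a ground-truth instance is at least 0.50, and for AP75 when it is at least 0.75.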
| Network | Parameters/10^6 | AP/% | APE/% | APM/% | APH/% |
|---|---|---|---|---|---|
| OpenPose | — | — | 62.7 | 48.7 | 32.3 |
| HRNet | 28.5 | 71.3 | 80.5 | 71.4 | 62.5 |
| KAPAO-l | 77.0 | 68.9 | 76.6 | 69.9 | 59.5 |
| RTMO-s | 9.9 | 67.3 | 73.7 | 68.2 | 59.1 |
| YOLOv8-s | 11.6 | 62.1 | 69.6 | 63.4 | 58.3 |
| HG-YOLO-s | 7.9 | 64.7 | 70.8 | 65.7 | 62.5 |
| RTMO-m | 22.6 | 71.1 | 77.4 | 71.9 | 63.4 |
| YOLOv8-m | 26.4 | 64.9 | 73.5 | 64.4 | 59.3 |
| HG-YOLO-m | 19.6 | 67.8 | 75.2 | 66.8 | 63.6 |
Tab. 2 Comparison of experimental data on CrowdPose dataset
| Network | Parameters/10^6 | AP50/% | AP75/% | FLOPs/10^9 |
|---|---|---|---|---|
| HG-YOLO-s | 7.9 | 87.0 | 67.2 | 24.7 |
| GDE-POSE[ | 8.8 | 77.3 | — | 27.5 |
| HG-YOLO-m | 19.6 | 90.5 | 73.7 | 76.2 |
| YOLO-POSE[ | 22.3 | 89.1 | 69.5 | — |
| KSL-POSE[ | — | 89.6 | 58.9 | 117.3 |
Tab. 3 Performance comparison of improved YOLO human pose detection models
| Group | HG-Ghost | LSKA | LSCD | Parameters/10^6 | FLOPs/10^9 | AP(50:95)/% |
|---|---|---|---|---|---|---|
| 1 | — | — | — | 11.6 | 30.2 | 56.6 |
| 2 | √ | — | — | 9.5 | 27.8 | 57.2 |
| 3 | — | √ | — | 11.9 | 30.9 | 57.8 |
| 4 | — | — | √ | 9.5 | 26.4 | 55.4 |
| 5 | √ | √ | — | 9.9 | 28.5 | 58.9 |
| 6 | — | √ | √ | 9.9 | 27.1 | 56.3 |
| 7 | √ | — | √ | 7.5 | 24.0 | 55.4 |
| 8 | √ | √ | √ | 7.9 | 24.7 | 58.4 |
Tab. 4 Ablation study results
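The ablation rows in Tab. 4 are easier to compare as changes relative to group 1, the configuration with none of the three modules enabled. The sketch below recomputes those changes from the table values only. The names `ABLATION` and `deltas_vs_baseline` are hypothetical, and treating group 1 as the baseline is an assumption based on the table layout.

```python
# Rows of Tab. 4: (group, HG-Ghost, LSKA, LSCD,
#                  parameters in 10^6, FLOPs in 10^9, AP(50:95) in %).
ABLATION = [
    (1, False, False, False, 11.6, 30.2, 56.6),
    (2, True,  False, False,  9.5, 27.8, 57.2),
    (3, False, True,  False, 11.9, 30.9, 57.8),
    (4, False, False, True,   9.5, 26.4, 55.4),
    (5, True,  True,  False,  9.9, 28.5, 58.9),
    (6, False, True,  True,   9.9, 27.1, 56.3),
    (7, True,  False, True,   7.5, 24.0, 55.4),
    (8, True,  True,  True,   7.9, 24.7, 58.4),
]


def deltas_vs_baseline(rows):
    """Map group -> (parameter change %, FLOPs change %, AP change in points),
    all measured against group 1 (assumed baseline)."""
    base = next(r for r in rows if r[0] == 1)
    out = {}
    for group, _ghost, _lska, _lscd, params, flops, ap in rows:
        out[group] = (
            round(100.0 * (params - base[4]) / base[4], 1),
            round(100.0 * (flops - base[5]) / base[5], 1),
            round(ap - base[6], 1),
        )
    return out


if __name__ == "__main__":
    for group, (dp, df, dap) in deltas_vs_baseline(ABLATION).items():
        print(f"group {group}: params {dp:+.1f}%, FLOPs {df:+.1f}%, AP {dap:+.1f}")
```

For group 8, with all three modules enabled, this gives roughly 31.9% fewer parameters and 18.2% fewer FLOPs than group 1, with AP(50:95) higher by 1.8 points. Group 5 (HG-Ghost plus LSKA) has the largest AP gain, 2.3 points, at a smaller parameter saving.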
| [1] | HUNG J M, CHIANG J Y, WANG K. Tennis player pose classification using YOLO and MLP neural networks[C]// Proceedings of the 2021 International Symposium on Intelligent Signal Processing and Communication Systems. Piscataway: IEEE, 2021: 1-2. |
| [2] | TANG L T, HUANG H Q. A YOLOv7 based lightweight underwater target detection algorithm[J]. Electronics Optics and Control, 2024, 31(9): 92-97. |
| [3] | YANG J H, LI H, DU Y Y, et al. A lightweight object detection algorithm based on improved YOLOv5s[J]. Electronics Optics and Control, 2023, 30(2): 24-30. |
| [4] | TERVEN J, CÓRDOVA-ESPARZA D M, ROMERO-GONZÁLEZ J A. A comprehensive review of YOLO: from YOLOv1 to YOLOv8 and YOLO-NAS[J]. Machine Learning and Knowledge Extraction, 2023, 5(4): 1680-1716. |
| [5] | NEWELL A, YANG K, DENG J. Stacked hourglass networks for human pose estimation[C]// Proceedings of the 2016 European Conference on Computer Vision, LNCS 9912. Cham: Springer, 2016: 483-499. |
| [6] | CHEN Y, WANG Z, PENG Y, et al. Cascaded pyramid network for multi-person pose estimation[C]// Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 7103-7112. |
| [7] | CAO Z, SIMON T, WEI S E, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2017: 1302-1310. |
| [8] | CHENG B, XIAO B, WANG J, et al. HigherHRNet: scale-aware representation learning for bottom-up human pose estimation[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 5385-5394. |
| [9] | DONG C, TANG Y, ZHANG L. HDA-Pose: a real-time 2D human pose estimation method based on modified YOLOv8[J]. Signal, Image and Video Processing, 2024, 18(8/9): 5823-5839. |
| [10] | ZHENG B, ZHANG H, JIN L. Research on multi-person pose estimation based on YOLO and decoupled multi-level feature layers fusion[C]// Proceedings of the 5th ACM International Conference on Multimedia in Asia. New York: ACM, 2023: No.58. |
| [11] | MOU F, REN H, WANG B, et al. Pose estimation and robotic insertion tasks based on YOLO and layout features[J]. Engineering Applications of Artificial Intelligence, 2022, 114: No.105164. |
| [12] | ZHANG Z, LU X, CAO G, et al. ViT-YOLO: Transformer-based YOLO for object detection[C]// Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops. Piscataway: IEEE, 2021: 2799-2808. |
| [13] | ZHANG F, CAO W, WANG S, et al. Improved YOLOv4 recognition algorithm for pitaya based on coordinate attention and combinational convolution[J]. Frontiers in Plant Science, 2022, 13: No.1030021. |
| [14] | WU J, DONG J, NIE W, et al. A lightweight YOLOv5 optimization of coordinate attention[J]. Applied Sciences, 2023, 13(3): No.1746. |
| [15] | GONG C, ZHANG Y, WEI Y, et al. Multicow pose estimation based on keypoint extraction[J]. PLoS ONE, 2022, 17(6): No.e0269259. |
| [16] | LIU J, CAI Q, ZOU F, et al. BiGA-YOLO: a lightweight object detection network based on YOLOv5 for autonomous driving[J]. Electronics, 2023, 12(12): No.2745. |
| [17] | JIANG Y, YANG K, ZHU J, et al. YOLO-RlePose: improved YOLO based on Swin Transformer and RLE-OKS loss for multi-person pose estimation[J]. Electronics, 2024, 13(3): No.563. |
| [18] | MAJI D, NAGORI S, MATHEW M, et al. YOLO-Pose: enhancing YOLO for multi-person pose estimation using object keypoint similarity loss[C]// Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE, 2022: 2636-2645. |
| [19] | DAI Y, LIU W. GL-YOLO-Lite: a novel lightweight fallen person detection model[J]. Entropy, 2023, 25(4): No.587. |
| [20] | LI X, GUO Y, PAN W, et al. Human pose estimation based on lightweight multi-scale coordinate attention[J]. Applied Sciences, 2023, 13(6): No.3614. |
| [21] | ZHANG Y, WANG Z, LI M, et al. SP-YOLO: an end-to-end lightweight network for real-time human pose estimation[J]. Signal, Image and Video Processing, 2024, 18(1): 863-876. |
| [22] | DONG C, DU G. An enhanced real-time human pose estimation method based on modified YOLOv8 framework[J]. Scientific Reports, 2024, 14: No.8012. |
| [23] | ZHAO Y, LV W, XU S, et al. DETRs beat YOLOs on real-time object detection[C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 16965-16974. |
| [24] | HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations[C]// Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2020: 1577-1586. |
| [25] | WANG A, CHEN H, LIU L, et al. YOLOv10: real-time end-to-end object detection[C]// Proceedings of the 38th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates Inc., 2024: 107984-108011. |
| [26] | LAU K W, PO L M, REHMAN Y A UR. Large separable kernel attention: rethinking the large kernel attention design in CNN[J]. Expert Systems with Applications, 2024, 236: No.121352. |
| [27] | GUO M H, LU C Z, LIU Z N, et al. Visual attention network[J]. Computational Visual Media, 2023, 9(4): 733-752. |
| [28] | LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// Proceedings of the 2014 European Conference on Computer Vision, LNCS 8693. Cham: Springer, 2014: 740-755. |
| [29] | LI J, WANG C, ZHU H, et al. CrowdPose: efficient crowded scenes pose estimation and a new benchmark[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 10863-10872. |
| [30] | TIAN Z, CHEN H, SHEN C. DirectPose: direct end-to-end multi-person pose estimation[EB/OL]. [2024-04-06].. |
| [31] | CAO Z, HIDALGO G, SIMON T, et al. OpenPose: realtime multi-person 2D pose estimation using part affinity fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 43(1): 172-186. |
| [32] | SUN K, XIAO B, LIU D, et al. Deep high-resolution representation learning for human pose estimation[C]// Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2019: 5686-5696. |
| [33] | McNALLY W, VATS K, WONG A, et al. Rethinking keypoint representations: modeling keypoints and poses as objects for multi-person human pose estimation[C]// Proceedings of the 2022 European Conference on Computer Vision, LNCS 13666. Cham: Springer, 2022: 37-54. |
| [34] | LU P, JIANG T, LI Y N, et al. RTMO: towards high-performance one-stage real-time multi-person pose estimation[C]// Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2024: 1491-1500. |
| [35] | KUOK K, LIU X, YE J, et al. GDE-pose: a real-time adaptive compression and multi-scale dynamic feature fusion approach for pose estimation[J]. Electronics, 2024, 13(23): No.4847. |
| [36] | DING J, NIU S, NIE Z, et al. Research on human posture estimation algorithm based on YOLO-Pose[J]. Sensors, 2024, 24(10): No.3036. |
| [37] | LU T, CHENG K, HUA X, et al. KSL-POSE: a real-time 2D human pose estimation method based on modified YOLOv8-pose framework[J]. Sensors, 2024, 24(19): No.6249. |