Journal of Computer Applications

Improved lightweight and high-precision pose detection network HG-YOLO

  

  • Received: 2024-12-27 Revised: 2025-04-10 Online: 2025-04-28 Published: 2025-04-28

CUI Jiali (崔家礼)1, LIU Yongji (刘永基)1, LI Zihe (李子贺)1, ZHENG Han (郑瀚)2

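  1. North China University of Technology
  2. Guangxi Colleges and Universities Key Laboratory of Artificial Intelligence and Information Processing
  • Corresponding author: ZHENG Han (郑瀚)
  • Supported by:
    Open Research Fund of the Guangxi Colleges and Universities Key Laboratory of Artificial Intelligence and Information Processing (Hechi University)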

Abstract: Existing deep learning networks for human pose detection suffer from insufficient detection precision, large parameter counts, and high computational cost, which seriously limits their application. To address these problems, a lightweight and high-precision improved network, HG-YOLO (High-accuracy and Ghost YOLO), is proposed. To improve detection precision, the Transformer-based detection network RT-DETR (Real-Time Detection Transformer) is integrated into the backbone of HG-YOLO, and a Large Separable Kernel Attention (LSKA) module is embedded in the backbone; this strengthens feature extraction in complex scenes without increasing memory usage or computational complexity, and thereby improves pose detection precision. To reduce parameters and computational cost, HG-YOLO replaces some standard convolutions with lightweight Ghost convolution modules and uses a shared convolution detection head, which cuts convolution computation through parameter and weight sharing. Experimental results on the COCO 2017-Keypoints and CrowdPose datasets show that, compared with the baseline YOLOv8-Pose network, HG-YOLO reduces the number of parameters by 32% and the number of floating-point operations by 18%, while AP50 (Average Precision at OKS = 0.50) improves by 1.8 percentage points on COCO 2017-Keypoints and AP improves by 2.9 percentage points on CrowdPose. These results indicate that HG-YOLO is both lightweight and precise, making it a strong network model for human pose detection.
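To make the two lightweighting ideas concrete, the following is a minimal arithmetic sketch of why Ghost convolution and a shared detection head reduce weights. It is not the HG-YOLO implementation: the Ghost ratio of 2, the 5×5 depthwise "cheap" kernel, the channel widths, and the assumption that all three detection scales share one channel width are illustrative choices, and the helper names `conv_params` and `ghost_conv_params` are hypothetical.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a bias-free 2D convolution with a k x k kernel."""
    return (c_in // groups) * c_out * k * k


def ghost_conv_params(c_in, c_out, k, cheap_k=5):
    """Weight count of a Ghost convolution (assumed ratio 2).

    Half of the output channels come from an ordinary k x k convolution;
    the other half are generated from those by a cheap depthwise convolution.
    """
    c_half = c_out // 2
    primary = conv_params(c_in, c_half, k)
    cheap = conv_params(c_half, c_half, cheap_k, groups=c_half)
    return primary + cheap


# Illustrative layer: 128 -> 256 channels, 3 x 3 kernel (assumed sizes).
standard = conv_params(128, 256, 3)
ghost = ghost_conv_params(128, 256, 3)
ghost_ratio = ghost / standard  # roughly half the weights

# Illustrative head: one 3 x 3 conv per scale vs. one conv shared by 3 scales,
# assuming every scale has already been projected to 256 channels.
per_scale = conv_params(256, 256, 3)
separate_heads = 3 * per_scale
shared_head = per_scale

print(standard, ghost, round(ghost_ratio, 3))
print(separate_heads, shared_head)
```

Under these assumptions the Ghost layer needs about half the weights of the standard layer, and the shared head needs one third of the weights of three separate heads. The 32% overall parameter reduction reported in the abstract is smaller because only some convolutions in the network are replaced.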

Key words: pose estimation, HG-YOLO, Large Separable Kernel Attention(LSKA), shared convolutional detection head, lightweight network.

