GatedUniPose: A Novel Approach for Pose Estimation Combining UniRepLKNet and Gated Convolution

Liang Feng, Ming Xu, Lihua Wen, Zhixuan Shen

TL;DR

GatedUniPose targets robust human pose estimation in complex and occluded scenes. It integrates Gated Convolution (GConv) into a UniRepLKNet backbone, uses the GLACE module for embedding, and applies DySample upsampling when fusing feature maps in the head. The method reports strong accuracy on COCO and MPII with a relatively small parameter count, and competitive results on CrowdPose. Key contributions are the backbone integration of GConv and GLACE, the DySample-based head fusion, and a distillation-based loss strategy, all aimed at more context-aware joint estimation. Code and models are slated for public release.

Abstract

Pose estimation is a crucial task in computer vision, with wide applications in autonomous driving, human motion capture, and virtual reality. However, existing methods still face challenges in achieving high accuracy, particularly in complex scenes. This paper proposes a novel pose estimation method, GatedUniPose, which combines UniRepLKNet and Gated Convolution and introduces the GLACE module for embedding. Additionally, we enhance the feature map concatenation method in the head layer by using DySample upsampling. Compared to existing methods, GatedUniPose excels in handling complex scenes and occlusion challenges. Experimental results on the COCO, MPII, and CrowdPose datasets demonstrate that GatedUniPose achieves significant performance improvements with a relatively small number of parameters, yielding better or comparable results to models with similar or larger parameter sizes.

Paper Structure

This paper contains 18 sections, 4 equations, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Comparison of GatedUniPose with advanced methods on the COCO test-dev2017 set in terms of model size and precision. The size of each bubble represents the input size of the model.
  • Figure 2: The overall architecture of GatedUniPose.
  • Figure : (a) The overall architecture of SEBlock.
  • Figure : (b) The overall architecture of GCB.