GatedUniPose: A Novel Approach for Pose Estimation Combining UniRepLKNet and Gated Convolution
Liang Feng, Ming Xu, Lihua Wen, Zhixuan Shen
TL;DR
GatedUniPose targets robust human pose estimation in complex and occluded scenes. It integrates UniRepLKNet with Gated Convolution (GConv), uses the GLACE module for embedding, and applies DySample upsampling to improve feature fusion in the head. The approach achieves strong accuracy on COCO and MPII with a relatively small parameter footprint while remaining competitive on CrowdPose. Key contributions are the backbone integration of GConv and GLACE, the DySample-based head fusion, and a distillation-based loss strategy, which together improve context-aware joint estimation. The result is an efficient, accurate pose estimator suited to crowded scenes and real-world deployment; code and models are slated for public release.
Abstract
Pose estimation is a crucial task in computer vision, with wide applications in autonomous driving, human motion capture, and virtual reality. However, existing methods still struggle to achieve high accuracy, particularly in complex scenes. This paper proposes a novel pose estimation method, GatedUniPose, which combines UniRepLKNet and Gated Convolution and introduces the GLACE module for embedding. Additionally, we improve feature map concatenation in the head layer by using DySample upsampling. Compared with existing methods, GatedUniPose handles complex scenes and occlusion more effectively. Experimental results on the COCO, MPII, and CrowdPose datasets demonstrate that GatedUniPose achieves significant performance improvements with a relatively small number of parameters, with results better than or comparable to those of models with similar or larger parameter counts.
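To make the gating idea concrete, the following is a minimal sketch of a generic gated convolution in one dimension, assuming the common formulation in which a feature branch is multiplied element-wise by a sigmoid gate branch. It is not the GConv block used in GatedUniPose, whose exact structure is not specified here; the names `conv1d`, `gated_conv1d`, `feature_kernel`, and `gate_kernel` are hypothetical and chosen only for illustration.

```python
import math


def conv1d(signal, kernel):
    """Valid (no padding) 1D cross-correlation of a list with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv1d(signal, feature_kernel, gate_kernel):
    """Illustrative gating: features * sigmoid(gates), element-wise.

    Assumption: this is the generic gating pattern, not the paper's block.
    """
    features = conv1d(signal, feature_kernel)
    gates = [sigmoid(g) for g in conv1d(signal, gate_kernel)]
    return [f * g for f, g in zip(features, gates)]


# A zero gate kernel gives sigmoid(0) = 0.5, so every feature is halved.
signal = [0.0, 1.0, 2.0, 3.0, 4.0]
out = gated_conv1d(signal, [1.0, 0.0, -1.0], [0.0, 0.0, 0.0])
```

In a learned model both kernels would be trained, so the gate can suppress responses at some positions (for example, occluded regions) while passing others through.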
