Unsupervised Domain Adaptation for Occlusion Resilient Human Pose Estimation

Arindam Dutta, Sarosij Bose, Saketh Bachu, Calvin-Khang Ta, Konstantinos Karydis, Amit K. Roy-Chowdhury

TL;DR

OR-POSE tackles the problem of unsupervised domain adaptation for occlusion-resilient 2D human pose estimation by combining a mean-teacher pseudo-labeling scheme with a learned human pose prior and a visibility-guided curriculum. The method augments source data with occlusions, uses an EMA-based teacher to generate reliable pseudo-labels, and enforces anatomical plausibility through a zero-level set prior, all while progressively focusing on less-occluded samples before harder cases. Empirical results show approximately 7% absolute gains in PCK$@0.05$ on occluded target datasets, with maintained performance on unoccluded data, demonstrating practical robustness to occlusion and domain shifts. The approach offers a scalable, annotation-efficient pathway for deploying pose estimation systems in real-world, occlusion-rich environments.
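
To make the PCK$@0.05$ figure concrete, here is a minimal sketch of how a Percentage of Correct Keypoints score is commonly computed: a predicted keypoint counts as correct when it lies within 0.05 times a reference length of the ground truth. The function name `pck`, the `ref_size` parameter, and the sample coordinates are illustrative assumptions; the normalization length the authors use is not stated in this section.

```python
import math

def pck(pred, gt, ref_size, alpha=0.05):
    """Fraction of keypoints predicted within alpha * ref_size of ground truth.

    pred, gt : lists of (x, y) keypoints in pixels, in the same order.
    ref_size : reference length in pixels (assumed here; e.g. image or box size).
    alpha    : tolerance as a fraction of ref_size (0.05 for PCK@0.05).
    """
    threshold = alpha * ref_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt)

# Hypothetical 4-keypoint example with a 256-pixel reference length,
# so the tolerance is 0.05 * 256 = 12.8 pixels.
gt = [(100.0, 100.0), (150.0, 120.0), (200.0, 180.0), (90.0, 220.0)]
pred = [(103.0, 104.0), (150.0, 131.0), (230.0, 180.0), (90.0, 221.0)]
score = pck(pred, gt, ref_size=256)
print(score)  # 0.75: errors of 5, 11 and 1 px pass, the 30 px error fails
```

Under this reading, a gain of about 7% absolute means roughly 7 more keypoints out of every 100 fall inside the tolerance radius.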

Abstract

Occlusions are a significant challenge to human pose estimation algorithms, often resulting in inaccurate and anatomically implausible poses. Although current occlusion-robust human pose estimation algorithms exhibit impressive performance on existing datasets, their success is largely attributed to supervised training and the availability of additional information, such as multiple views or temporal continuity. Furthermore, these algorithms typically suffer from performance degradation under distribution shifts. While existing domain adaptive human pose estimation algorithms address this bottleneck, they tend to perform suboptimally when the target domain images are occluded, a common occurrence in real-life scenarios. To address these challenges, we propose OR-POSE: Unsupervised Domain Adaptation for Occlusion Resilient Human POSE Estimation. OR-POSE is an innovative unsupervised domain adaptation algorithm which effectively mitigates domain shifts and overcomes occlusion challenges by employing the mean teacher framework for iterative pseudo-label refinement. Additionally, OR-POSE reinforces realistic pose prediction by leveraging a learned human pose prior which incorporates the anatomical constraints of humans in the adaptation process. Lastly, OR-POSE avoids overfitting to inaccurate pseudo labels generated from heavily occluded images by employing a novel visibility-based curriculum learning approach. This enables the model to gradually transition from training samples with relatively less occlusion to more challenging, heavily occluded samples. Extensive experiments show that OR-POSE outperforms existing analogous state-of-the-art algorithms by $\sim$ 7% on challenging occluded human pose estimation datasets.
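
The visibility-based curriculum can be pictured with a small sketch. It assumes each target sample carries a scalar visibility score in [0, 1] and that the admission threshold decays linearly over epochs; the score definition, the schedule, and the names `visibility_threshold` and `select_samples` are hypothetical and are not taken from the paper.

```python
def visibility_threshold(epoch, num_epochs, start=0.8, end=0.2):
    """Linearly relax the minimum visibility required for a sample.

    start and end are assumed values: early epochs admit only samples that
    are mostly visible, later epochs admit heavily occluded ones as well.
    """
    if num_epochs <= 1:
        return end
    frac = epoch / (num_epochs - 1)
    return start + (end - start) * frac

def select_samples(visibility, epoch, num_epochs):
    """Return indices of samples whose visibility meets the current threshold."""
    tau = visibility_threshold(epoch, num_epochs)
    return [i for i, v in enumerate(visibility) if v >= tau]

# Hypothetical per-sample visibility scores (1.0 = fully visible).
visibility = [0.95, 0.60, 0.30, 0.85, 0.10]
early = select_samples(visibility, epoch=0, num_epochs=5)
late = select_samples(visibility, epoch=4, num_epochs=5)
print(early)  # [0, 3]: only the least occluded samples
print(late)   # [0, 1, 2, 3]: most samples, including harder ones
```

The point of the schedule is that pseudo labels from heavily occluded images, which are more likely to be wrong, only enter training once the model has adapted on easier samples.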

Paper Structure (15 sections, 12 equations, 7 figures, 6 tables)

Figures (7)

  • Figure 1: Need for unsupervised domain adaptation for occlusion resilient human pose estimation. Left: Predictions of a model trained exclusively on labeled source data (SURREAL) and evaluated on an image from the 3DOH50K dataset (referred to as Source only predictions). Middle: Predictions from the state-of-the-art domain adaptive human pose estimation algorithm UniFrame [kim2022unified]. Right: Predictions from our proposed occlusion resilient algorithm (OR-POSE). While UniFrame [kim2022unified] improves on the Source only predictions through unsupervised domain adaptation to the unlabeled target domain, it still fails to deliver accurate pose estimates under occlusions. In contrast, OR-POSE produces accurate human pose estimates in the unsupervised domain adaptive setting even in the presence of occlusions.
  • Figure 2: Problem Setup. We propose OR-POSE, an unsupervised algorithm for progressively adapting a model to occlusions. OR-POSE leverages pseudo labels from the mean-teacher framework to guide the model while using a pose prior to generate physically plausible human poses. Further, to prevent early overfitting to noisy pseudo labels and to account for the uneven levels of occlusion in the target domain, we propose a curriculum learning strategy in which the model learns first from more visible samples and then from harder (more occluded) samples.
  • Figure 3: Overview of proposed methodology: OR-POSE is built upon the mean-teacher framework, where the weights of the teacher model are updated as an exponential moving average (EMA) of the student model's weights. OR-POSE uses occlusion augmentations on the source domain, enabling the student and teacher models to provide better pseudo labels on unlabeled target images by learning consistency between occluded and unoccluded source images. OR-POSE also utilizes a human pose prior that captures plausible human anatomy as a zero-level set, thus penalizing anatomically implausible predictions. Finally, OR-POSE leverages a visibility-based curriculum learning strategy wherein the model first focuses on samples with fewer occlusions and gradually shifts to samples with higher degrees of occlusion.
  • Figure 4: Qualitative Results for SURREAL $\rightarrow$ 3DOH50K. From left to right: Source only predictions, predictions from UniFrame [kim2022unified], and predictions from our proposed algorithm OR-POSE.
  • Figure 5: Qualitative Results for SURREAL $\rightarrow$ Ocl-LSP. From left to right: Source only predictions, predictions from UniFrame [kim2022unified], and predictions from our proposed algorithm OR-POSE.
  • ...and 2 more figures
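
The teacher update described in the Figure 3 caption follows the standard mean-teacher rule: each teacher weight becomes a convex combination of its previous value and the current student weight. The sketch below uses plain dictionaries of floats in place of real network parameters; the function name `ema_update`, the default decay, and the toy weights are illustrative assumptions, and the decay value used by the authors is not given in this section.

```python
def ema_update(teacher, student, decay=0.999):
    """Update teacher weights in place as an EMA of the student weights.

    teacher, student : dicts mapping parameter names to float values
                       (stand-ins for real model tensors).
    decay            : EMA momentum; values near 1 make the teacher change
                       slowly. The default here is an assumed value.
    """
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value
    return teacher

# Toy example with two scalar "parameters" and a deliberately small decay
# so the effect of a single step is visible.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 0.0, "b": 1.0}
ema_update(teacher, student, decay=0.9)
print(teacher)  # w moves from 1.0 toward 0.0, b moves from 0.0 toward 1.0
```

Because the teacher averages over many student states, its predictions on unlabeled target images tend to be more stable than the student's, which is why they serve as pseudo labels.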