POSTURE: Pose Guided Unsupervised Domain Adaptation for Human Body Part Segmentation

Arindam Dutta, Rohit Lal, Yash Garg, Calvin-Khang Ta, Dripta S. Raychaudhuri, Hannah Dela Cruz, Amit K. Roy-Chowdhury

TL;DR

POSTURE tackles domain shifts in human body part segmentation by embedding pose-based anatomical priors into unsupervised domain adaptation. It introduces a learned pose-to-segmentation mapping G and leverages reliable pseudo-labels, plus an optional source-free variant SF-POSTURE, to adapt a source-trained model to unlabeled targets. Across multiple benchmarks, POSTURE achieves approximately an 8 percentage-point improvement over prior UDA methods and maintains strong performance in source-free settings, while producing anatomically coherent masks. The approach reduces labeling requirements and privacy concerns, offering a practical, scalable solution for robust body-part parsing under distribution shifts.

Abstract

Existing algorithms for human body part segmentation have shown promising results on challenging datasets, primarily relying on end-to-end supervision. However, these algorithms exhibit severe performance drops in the face of domain shifts, leading to inaccurate segmentation masks. To tackle this issue, we introduce POSTURE: Pose Guided Unsupervised Domain Adaptation for Human Body Part Segmentation, an innovative pseudo-labelling approach designed to improve segmentation performance on unlabeled target data. Unlike conventional domain adaptive methods for general semantic segmentation, POSTURE considers the underlying structure of the human body and uses anatomical guidance from pose keypoints to drive the adaptation process. This strong inductive prior translates to performance improvements averaging 8% over existing state-of-the-art domain adaptive semantic segmentation methods across three benchmark datasets. Furthermore, the flexibility of the proposed approach allows a straightforward extension to source-free settings (SF-POSTURE), mitigating potential privacy and computational concerns with a negligible drop in performance.

Paper Structure (18 sections, 11 equations, 6 figures, 7 tables)

This paper contains 18 sections, 11 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: Need for domain adaptive human body part segmentation. Left to right: RGB image [cornett2023expanding], masks predicted by a model trained on synthetic data [varol2017learning], predictions from a SOTA UDA algorithm for semantic segmentation [vu2019advent], and predictions from POSTURE. The model trained on synthetic data produces highly inaccurate predictions due to the distribution shift between synthetic and real images. While domain adaptive semantic segmentation [vu2019advent] improves the result, its inability to account for the inherent anatomical structure leads to sub-optimal segmentation. In contrast, POSTURE considers the underlying anatomical structure of the human body, delivering accurate segmentation masks for human body parts under domain shifts.
  • Figure 2: Problem overview. We introduce POSTURE for domain-adaptive human body part segmentation. By leveraging confident pseudo-labels and explicit anatomical guidance from pose keypoints, POSTURE delivers improved segmentation masks in the presence of domain shifts. A pre-trained parametric mapping is used to align the estimated pose keypoints with body part segmentation maps. This mapping acts as a strong prior in the face of domain shifts, preventing the target model from overfitting to anatomically implausible segmentation maps. We also extend POSTURE to source-free UDA settings (SF-POSTURE), thereby addressing privacy and storage bottlenecks associated with using source data for adaptation.
  • Figure 3: Overview of the proposed framework. Our proposed algorithm POSTURE uses confident pseudo-labels from $\mathcal{F}_{T}$ to refine that model's own predictions on unlabeled data $\mathcal{T}$. Further, POSTURE leverages the anatomical context of the human body through pose estimates (obtained from $\mathcal{P}$) aligned with body part segmentation masks by $\mathcal{G}$. Note that POSTURE can be extended to source-free settings, where $\{ x_{s}, y_{s} \}$ are absent (elaborated in Section \ref{method-SFPIPS}). Even in the absence of source data, the anatomical context obtained from $\mathcal{G} \circ \mathcal{P}$ acts as a strong regularizer, mitigating catastrophic forgetting and compensating for domain shifts.
  • Figure 4: Learning algorithm for $\mathcal{G}$. The model $\mathcal{G}$ is designed to predict segmentation maps $\hat{s} \in \mathbb{R}^{H \times W \times K}$ given pose keypoints $p \in \mathbb{R}^{P \times 2}$. $\mathcal{G}$ is trained using the cross-entropy loss with ground-truth segmentation maps $s \in \mathbb{R}^{H \times W \times K}$.
  • Figure 5: Qualitative results on the H36M [h36m_pami] and UP [Lassner:UP:2017] datasets. The left column shows qualitative results for SUR [varol2017learning] $\rightarrow$ H36M [h36m_pami], and the right column shows qualitative results for SUR [varol2017learning] $\rightarrow$ UP [Lassner:UP:2017]. Left to right: RGB image [h36m_pami, Lassner:UP:2017], source-only predictions, pose keypoints estimated using UDAPE [kim2022unified], predictions of AdvEnt [vu2019advent], predictions from POSTURE, and predictions of SF-POSTURE.
  • ...and 1 more figures
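
The captions for Figures 3 and 4 mention two ingredients that share one loss: confident pseudo-labels from the target model, and pixel-wise cross-entropy against a segmentation map of shape H x W x K. The following is a minimal NumPy sketch of how these pieces could fit together, not the authors' implementation. The function names, the confidence threshold of 0.9, and the `ignore_index` convention are assumptions made for illustration; the captions do not state how confidence is measured or thresholded.

```python
import numpy as np


def confident_pseudo_labels(probs, threshold=0.9, ignore_index=-1):
    """Turn per-pixel class probabilities of shape (H, W, K) into hard labels.

    Pixels whose top probability is below `threshold` are marked with
    `ignore_index`. Both values are hypothetical, chosen for illustration.
    """
    labels = probs.argmax(axis=-1)
    confidence = probs.max(axis=-1)
    return np.where(confidence >= threshold, labels, ignore_index)


def pixel_cross_entropy(probs, labels, ignore_index=-1, eps=1e-12):
    """Mean cross-entropy over pixels whose label is not `ignore_index`.

    probs:  (H, W, K) predicted class probabilities.
    labels: (H, W) integer class indices.
    """
    valid = labels != ignore_index
    if not valid.any():
        return 0.0  # no usable pixel, so nothing contributes to the loss
    # Replace ignored labels with a dummy class so that indexing is safe;
    # those pixels are masked out of the sum below.
    safe_labels = np.where(valid, labels, 0)
    picked = np.take_along_axis(probs, safe_labels[..., None], axis=-1)[..., 0]
    return float(-(np.log(picked + eps) * valid).sum() / valid.sum())
```

With a one-hot ground-truth map `s` of shape (H, W, K), as in the caption of Figure 4, the same loss could be evaluated as `pixel_cross_entropy(s_hat, s.argmax(axis=-1))`, where `s_hat` is a hypothetical name for the predicted map. For self-training on target images, the labels would instead come from `confident_pseudo_labels`.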