PoseAdapt: Sustainable Human Pose Estimation via Continual Learning Benchmarks and Toolkit

Muhammad Saif Ullah Khan, Didier Stricker

TL;DR

PoseAdapt tackles the brittleness of static pose estimators by introducing a continual adaptation framework and benchmark suite for 2D human pose estimation. It formalizes domain-incremental and class-incremental tracks with fixed backbones and strict budgets, enabling fair comparison of continual learning strategies such as Less-Forgetting Learning (LFL), Learning without Forgetting (LwF), and Elastic Weight Consolidation (EWC). The experiments reveal clear stability–plasticity trade-offs: LFL offers the strongest retention across photometric shifts, while cross-modality degeneration remains a key challenge. The work provides a practical pathway toward sustainable, incremental pose estimation suitable for edge deployment and evolving tasks.

Abstract

Human pose estimators are typically retrained from scratch or naively fine-tuned whenever keypoint sets, sensing modalities, or deployment domains change, an inefficient, compute-intensive practice that rarely matches field constraints. We present PoseAdapt, an open-source framework and benchmark suite for continual pose model adaptation. PoseAdapt defines domain-incremental and class-incremental tracks that simulate realistic changes in density, lighting, and sensing modality, as well as skeleton growth. The toolkit supports two workflows: (i) Strategy Benchmarking, which lets researchers implement continual learning (CL) methods as plugins and evaluate them under standardized protocols; and (ii) Model Adaptation, which allows practitioners to adapt strong pretrained models to new tasks with minimal supervision. We evaluate representative regularization-based methods in single-step and sequential settings. Benchmarks enforce a fixed lightweight backbone, no access to past data, and tight per-step budgets. This isolates the effect of the adaptation strategy and highlights the difficulty of maintaining accuracy under strict resource limits. PoseAdapt connects modern CL techniques with practical pose estimation needs, enabling adaptable models that improve over time without repeated full retraining.
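
As a reading aid, the generic shape of a regularization-based continual learning objective can be written as follows. This is a sketch in assumed notation, not necessarily the form or symbols used in the paper's own equations: $\theta$ denotes the current weights, $\theta^{*}_{i-1}$ the frozen weights after the previous experience, $\mathcal{D}_i$ the data of the current experience, $\lambda$ a regularization strength, $F_j$ a diagonal Fisher information estimate, and $\phi_{\theta}$ a feature extractor.

$$
\mathcal{L}_i(\theta) = \mathcal{L}_{\text{pose}}(\theta;\, \mathcal{D}_i) + \lambda\, \Omega\big(\theta,\, \theta^{*}_{i-1}\big)
$$

$$
\Omega_{\text{EWC}} = \tfrac{1}{2} \sum_j F_j \big(\theta_j - \theta^{*}_{i-1,j}\big)^2,
\qquad
\Omega_{\text{LFL}} = \tfrac{1}{2} \big\lVert \phi_{\theta}(x) - \phi_{\theta^{*}_{i-1}}(x) \big\rVert_2^2
$$

EWC penalizes movement of individual weights in proportion to their estimated importance, and LFL penalizes drift in intermediate features. LwF fits the same template with a distillation term that matches the current model's outputs on $\mathcal{D}_i$ to those of the frozen previous model. None of the three needs stored samples from earlier experiences, which is consistent with the "no access to past data" constraint stated above.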

Paper Structure (15 sections, 7 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: PoseAdapt Benchmarks. We introduce a diverse suite of domain- and class-incremental benchmarks for human pose estimation. Top: Domain-incremental settings simulate increasing difficulty through scale, perspective, occlusion, lighting, and modality shifts. Bottom: Class-incremental benchmarks gradually add new keypoint types to evaluate the ability to extend skeletons over time. All benchmarks share a fixed backbone, fixed per-step data budget, and unified evaluation protocol.
  • Figure 2: Off-the-shelf models struggle under realistic shifts. Top: Accuracy on the pretrained reference dataset (blue) drops consistently under sequential shifts in density, lighting, and modality (red), even though these changes are relatively minor compared to training conditions. Bottom: When brightness is progressively reduced on the same dataset, AP declines steadily, underscoring the brittleness of static models to illumination variation.
  • Figure 3: Adaptation strategy comparison. Top: Conventional solutions either train models separately (resource waste) or finetune (prone to forgetting). Bottom: PoseAdapt enables adaptation using continual learning (CL) techniques to retain prior knowledge while specializing to new skeletons or domains.
  • Figure 4: PoseAdapt Framework. Left: At each experience $\mathcal{E}_i$, PoseAdapt initializes the model for the new experience (e.g., snapshot creation, head expansion, or architecture-specific adjustments), followed by an adaptation phase that optimizes the model on $\mathcal{D}_i$ with strategy-defined regularization, and a finalization step to compute and store any statistics for later experiences. Right: Head expansion for class-incremental experiences, where the output dimensionality grows as new keypoints are introduced.
  • Figure 5: Reference domain. Examples from the COCO validation set representing the well-lit RGB baseline used for the initial experience in all benchmarks.
  • ...and 4 more figures
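
The three phases described in the Figure 4 caption (initialization, adaptation, finalization) and the head expansion step can be illustrated with a toy sketch. Everything below is hypothetical and is not the PoseAdapt API: the class and function names are invented for illustration, the "model" is a flat list with one weight per keypoint, the task loss is a simple quadratic, and the regularizer is a plain L2 anchor to the previous weights standing in for LFL, LwF, or EWC.

```python
import copy


class NaiveFinetune:
    """Hypothetical strategy plugin with no regularization."""

    def before_experience(self, weights):
        pass  # initialization phase: nothing to snapshot

    def penalty_grad(self, weights):
        return [0.0] * len(weights)

    def after_experience(self, weights):
        pass  # finalization phase: nothing to store


class L2Anchor(NaiveFinetune):
    """Hypothetical plugin: pulls weights toward a snapshot of the previous model."""

    def __init__(self, strength):
        self.strength = strength
        self.snapshot = None

    def before_experience(self, weights):
        # Snapshot creation: keep a frozen copy of the weights before adapting.
        self.snapshot = copy.deepcopy(weights)

    def penalty_grad(self, weights):
        # Gradient of (strength / 2) * ||w - snapshot||^2 over the old keypoints.
        anchored = [self.strength * (w - s) for w, s in zip(weights, self.snapshot)]
        # Keypoints added after the snapshot have no anchor, so their gradient is 0.
        return anchored + [0.0] * (len(weights) - len(anchored))


def expand_head(weights, num_new):
    """Grow the output head: keep existing keypoint weights, append fresh ones."""
    return list(weights) + [0.0] * num_new


def run_experience(weights, targets, strategy, num_new=0, steps=200, lr=0.1):
    """Run one experience: initialize, adapt, finalize. Returns the new weights."""
    # 1) Initialization: snapshot the old model, then expand the head if needed.
    strategy.before_experience(weights)
    weights = expand_head(weights, num_new)
    # 2) Adaptation: gradient descent on the toy loss 0.5 * ||w - targets||^2
    #    plus the strategy-defined penalty.
    for _ in range(steps):
        reg = strategy.penalty_grad(weights)
        weights = [w - lr * ((w - t) + r) for w, t, r in zip(weights, targets, reg)]
    # 3) Finalization: let the strategy store statistics for later experiences.
    strategy.after_experience(weights)
    return weights


old = [1.0, 1.0]               # weights after the reference experience
new_targets = [3.0, 3.0, 5.0]  # optimum of the new experience, with one added keypoint
naive = run_experience(old, new_targets, NaiveFinetune(), num_new=1)
anchored = run_experience(old, new_targets, L2Anchor(strength=1.0), num_new=1)
```

In this toy setting, naive fine-tuning moves the two original weights all the way to the new optimum, while the anchored run settles halfway between the old and new values and still fits the newly added keypoint freely. That is a small-scale picture of the stability–plasticity trade-off the benchmarks are designed to measure.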