ESCAPE: Energy-based Selective Adaptive Correction for Out-of-distribution 3D Human Pose Estimation
Luke Bidulka, Mohsen Gholami, Jiannan Zheng, Martin J. McKeown, Z. Jane Wang
TL;DR
ESCAPE addresses the generalization gap of 3D human pose estimation (HPE) on out-of-distribution (OOD) data with a lightweight, energy-based selective test-time adaptation (TTA) framework. A free-energy-based OOD detector decides how each test sample is handled: in-distribution (ID) samples receive a fast correction from a correction network (CNet) that fixes distal keypoint errors, while OOD samples trigger a self-consistent adaptation strategy (RCNet). The method improves distal MPJPE by up to 7% and achieves state-of-the-art results on 3DPW and 3DHP. It is also significantly faster than prior TTA approaches, because adaptation is confined to a small external network and the backbone parameters never need to be restored. Across multiple backbone models and datasets, ESCAPE delivers robust improvements, and extensive ablations confirm the value of energy-based sample selection and of the proposed correction/adaptation scheme. The approach offers a practical path to deploying accurate 3D HPE in the wild by balancing accuracy gains against inference cost.
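To make the selective routing concrete, here is a minimal Python sketch; it is not the authors' implementation. It uses the standard free energy score from energy-based OOD detection, E(x) = -T log sum_i exp(f_i(x) / T), under which a higher energy suggests an OOD sample. How ESCAPE derives the scores f_i(x) from a pose backbone is not stated in this section, so the `scores` input, the `threshold` value, and the names `free_energy` and `route` are all assumptions made for illustration.

```python
import math

def free_energy(scores, temperature=1.0):
    """Free energy E = -T * log(sum_i exp(s_i / T)), computed stably."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp

def route(scores, threshold, temperature=1.0):
    """Pick a branch for one sample (hypothetical labels).

    'adapt'   -> sample flagged OOD, run the costly test-time adaptation
    'correct' -> sample treated as ID, run the fast forward-pass correction
    """
    energy = free_energy(scores, temperature)
    return "adapt" if energy > threshold else "correct"
```

In practice the threshold would be chosen from energies observed on held-out ID data; the sketch leaves that choice to the caller.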
Abstract
Despite recent advances in human pose estimation (HPE), poor generalization to out-of-distribution (OOD) data remains a difficult problem. Previous works have proposed Test-Time Adaptation (TTA) to bridge the train-test domain gap by refining network parameters at inference. However, the absence of ground-truth annotations makes TTA highly challenging, and existing methods typically increase inference times by one or more orders of magnitude. We observe that 1) not every test-time sample is OOD, and 2) HPE errors are significantly larger on distal keypoints (wrist, ankle). Motivated by these observations, we propose ESCAPE: a lightweight correction and selective adaptation framework which applies a fast, forward-pass correction on most data while reserving costly TTA for OOD data. We use the free energy function to separate OOD samples from incoming data, and we train a correction network to estimate the errors of a pretrained backbone's HPE predictions on the distal keypoints. For OOD samples, we propose a novel self-consistency adaptation loss that updates the correction network by leveraging the constraining relationship between distal keypoints and proximal keypoints (shoulders, hips), via a second "reverse" network. ESCAPE improves the distal MPJPE of five popular HPE models by up to 7% on unseen data, achieves state-of-the-art results on two popular HPE benchmarks, and is significantly faster than existing adaptation methods.
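The distal MPJPE reported above is the mean per-joint position error restricted to the distal keypoints. The following is a minimal sketch of that metric, assuming each pose is a list of (x, y, z) tuples. The joint indices in `DISTAL` are hypothetical and depend on the skeleton layout of the dataset; `relative_gain` is an illustrative helper, not a function from the paper.

```python
import math

# Hypothetical indices of the wrists and ankles in a toy 6-joint skeleton.
DISTAL = [2, 3, 4, 5]

def mpjpe(pred, gt, joints=None):
    """Mean Euclidean distance between predicted and ground-truth joints.

    If `joints` is given, only those joint indices are averaged.
    """
    idx = range(len(pred)) if joints is None else joints
    dists = [math.dist(pred[j], gt[j]) for j in idx]
    return sum(dists) / len(dists)

def relative_gain(before, after):
    """Percentage reduction in error after applying a correction."""
    return 100.0 * (before - after) / before
```

Calling `mpjpe(pred, gt, DISTAL)` gives the distal error, and `relative_gain(100.0, 93.0)` evaluates to 7.0 for these toy numbers, which are not results from the paper.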
