CUPS: Improving Human Pose-Shape Estimators with Conformalized Deep Uncertainty

Harry Zhang; Luca Carlone

TL;DR

CUPS targets reliable 3D human pose–shape reconstruction from monocular video by combining a learned deep uncertainty function with conformal prediction, which yields calibrated, probabilistic guarantees. It uses a transformer-based GLoT architecture to predict SMPL parameters and jointly learns a conformity score $S_\theta(\boldsymbol{X}, \boldsymbol{Y})$. After training, this score defines a Deep Uncertainty Conformal Set with threshold $\tau^*$, which accounts for non-exchangeable data. The approach reports state-of-the-art accuracy on multiple datasets, supports multi-hypothesis predictions via Monte Carlo Dropout, and provides two practical bounds on the coverage gap to justify its uncertainty guarantees. These statistical guarantees make CUPS relevant to safety-critical vision tasks.

Abstract

We introduce CUPS, a novel method for learning sequence-to-sequence 3D human shapes and poses from RGB videos with uncertainty quantification. To improve on prior work, we develop a method to generate and score multiple hypotheses during training, effectively integrating uncertainty quantification into the learning process. This process results in a deep uncertainty function that is trained end-to-end with the 3D pose estimator. After training, the learned deep uncertainty model serves as the conformity score, which is used to calibrate a conformal predictor that assesses the quality of the output prediction. Since the data in human pose-shape learning is not fully exchangeable, we also present two practical bounds for the coverage gap in conformal prediction, providing theoretical backing for the uncertainty bound of our model. Our results indicate that by taking advantage of deep uncertainty with conformal prediction, our method achieves state-of-the-art performance across various metrics and datasets while inheriting the probabilistic guarantees of conformal prediction.
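To make the calibration step concrete, the following is a minimal sketch of standard split conformal calibration, not the paper's exact procedure. It assumes exchangeable calibration data and a nonconformity convention (lower score means a more plausible prediction); the threshold $\tau^*$ that CUPS derives for non-exchangeable data may differ. The function names `conformal_threshold` and `filter_hypotheses` are hypothetical, and plain arrays of numbers stand in for the outputs of the learned score $S_\theta$.

```python
import math
import numpy as np


def conformal_threshold(cal_scores, alpha):
    """Finite-sample split-conformal threshold.

    Assumption: cal_scores are nonconformity scores (lower = better)
    computed on a held-out calibration set.
    """
    scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = scores.size
    # Rank of the ceil((n + 1) * (1 - alpha))-th smallest calibration score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set is unbounded.
        return float("inf")
    return float(scores[k - 1])


def filter_hypotheses(hyp_scores, tau):
    """Return indices of hypotheses whose score does not exceed tau."""
    hyp_scores = np.asarray(hyp_scores, dtype=float)
    return np.flatnonzero(hyp_scores <= tau)


# Hypothetical usage: scores for several sampled hypotheses
# (for example, from Monte Carlo Dropout) are filtered by the threshold.
tau = conformal_threshold([9, 8, 7, 6, 5, 4, 3, 2, 1], alpha=0.5)
kept = filter_hypotheses([4.0, 6.0, 5.0], tau)
```

Under exchangeability, keeping only hypotheses with score at most `tau` gives a prediction set that contains the true output with probability at least $1 - \alpha$; the coverage-gap bounds mentioned in the abstract address how this guarantee degrades when exchangeability does not fully hold.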
