KBody: Towards general, robust, and aligned monocular whole-body estimation
Nikolaos Zioulis, James F. O'Brien
TL;DR
KBody tackles robust monocular whole-body estimation by marrying data-driven priors with an optimization-based fitting stage. It introduces a two-stage pipeline that optionally completes partial images using a StyleGAN-Human prior, then fits a parametric body model with a disentangled optimization loop, augmented by virtual joints and an asymmetric distance-field silhouette objective. Key contributions include virtual joints for improved keypoint correspondence, a disentangled body optimization that balances pose and shape, and an asymmetric distance field that robustly guides silhouette alignment. The approach yields improved pixel alignment and competitive pose/shape accuracy on challenging in-the-wild data, while highlighting the trade-off between speed and accuracy relative to single-shot estimators. This work advances practical monocular body fitting by enabling robust, metrically coherent estimates even from partially observed inputs, which can support downstream applications like avatar creation and virtual try-on.
Abstract
KBody is a method for fitting a low-dimensional body model to an image. It follows a predict-and-optimize approach, relying on data-driven model estimates for the constraints that are used to solve for the body's parameters. Acknowledging the importance of high-quality correspondences, it leverages "virtual joints" to improve fitting performance, disentangles the optimization of the pose and shape parameters, and integrates asymmetric distance fields to balance pose and shape capturing capacity against pixel alignment. We also show that generative model inversion offers a strong appearance prior that can be used to complete partial human images and can serve as a building block for generalized and robust monocular body fitting. Project page: https://zokin.github.io/KBody.
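To make the idea of an asymmetric distance-field silhouette term concrete, the following is a minimal sketch and not the formulation used in KBody. It assumes binary masks stored as nested lists, a brute-force Euclidean distance field, and two hypothetical weights (`w_outside`, `w_inside`) that penalize the rendered body spilling outside the target silhouette differently from target pixels left uncovered. The function names, the weights, and the choice of which direction is penalized more are all illustrative assumptions.

```python
from math import hypot


def distance_to_mask(mask):
    """Brute-force Euclidean distance from every pixel to the nearest foreground pixel."""
    h, w = len(mask), len(mask[0])
    fg = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    if not fg:
        raise ValueError("mask has no foreground pixels")
    return [
        [min(hypot(r - fr, c - fc) for fr, fc in fg) for c in range(w)]
        for r in range(h)
    ]


def asymmetric_silhouette_cost(rendered, target, w_outside=1.0, w_inside=0.25):
    """Hypothetical asymmetric silhouette cost between two binary masks.

    - Rendered pixels outside the target pay their distance to the target,
      scaled by w_outside.
    - Target pixels not covered by the rendering pay their distance to the
      rendering, scaled by w_inside.
    Unequal weights make the two directions asymmetric (assumed values).
    """
    h, w = len(target), len(target[0])
    dist_to_target = distance_to_mask(target)
    dist_to_rendered = distance_to_mask(rendered)
    spill = sum(
        dist_to_target[r][c]
        for r in range(h) for c in range(w)
        if rendered[r][c] and not target[r][c]
    )
    uncovered = sum(
        dist_to_rendered[r][c]
        for r in range(h) for c in range(w)
        if target[r][c] and not rendered[r][c]
    )
    return w_outside * spill + w_inside * uncovered


# Tiny 1x5 example: the target silhouette occupies the three middle pixels.
target = [[0, 1, 1, 1, 0]]
perfect = [[0, 1, 1, 1, 0]]
spilling = [[1, 1, 1, 1, 0]]   # one rendered pixel outside the target
shrunken = [[0, 0, 1, 1, 0]]   # one target pixel left uncovered

cost_perfect = asymmetric_silhouette_cost(perfect, target)
cost_spilling = asymmetric_silhouette_cost(spilling, target)
cost_shrunken = asymmetric_silhouette_cost(shrunken, target)
```

With these assumed weights, a one-pixel spill costs more than a one-pixel gap, so an optimizer driven by this term would prefer a body that stays inside the observed silhouette. In a real fitting loop the masks would come from a differentiable renderer and a segmentation estimate, and the distance fields would be precomputed rather than brute-forced.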
