Pro-Pose: Unpaired Full-Body Portrait Synthesis via Canonical UV Maps
Sandeep Mishra, Yasamin Jafarian, Andreas Lugmayr, Yingwei Li, Varsha Ramakrishnan, Srivatsan Varadharajan, Alan C. Bovik, Ira Kemelmacher-Shlizerman
TL;DR
Pro-Pose introduces a self-supervised framework for unpaired full-body portrait synthesis in canonical UV space, decoupling pose from texture to enable robust reposing from a single image. A novel Donor-based UV Reposing mechanism prevents pose leakage through occlusion boundaries, allowing learning from large unpaired datasets combined with scarce paired data. The model uses a Flow Matching–based generator in a latent UV space and supports test-time personalization via LoRA-based fine-tuning, producing identity-faithful avatars under novel poses. Across DeepFashion and WPose benchmarks, Pro-Pose achieves state-of-the-art fidelity and strong generalization to in-the-wild imagery, with ablations highlighting the importance of the hybrid data strategy and the personalization capability.
Abstract
Photographs taken by professional photographers typically present a person in flattering lighting, in an interesting pose, and at high image quality, unlike the photos people commonly take of themselves. In this paper, we explore how to create a "professional" version of a person's photograph, i.e., in a chosen pose, in a simple environment, with good lighting, and in standard black top/bottom clothing. A key challenge is to preserve the person's unique identity, face, and body features while transforming the photo. If a large paired dataset existed of the same person photographed both "in the wild" and by a professional photographer, the problem would potentially be easier to solve. However, such data does not exist, especially for a large variety of identities. To that end, our approach rests on two key insights: 1) Our method transforms the input photo and the person's face to a canonical UV space, which is further coupled with a reposing methodology to model occlusions and novel view synthesis. Operating in UV space allows us to leverage existing unpaired datasets. 2) We personalize the output photo via multi-image fine-tuning. Our approach yields high-quality, reposed portraits and achieves strong qualitative and quantitative performance on real-world imagery.
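To make the idea of a canonical UV space concrete, the following is a minimal sketch and not the paper's implementation. It assumes that some external estimator (not specified here) supplies, for every foreground pixel, a body-surface coordinate (u, v) in [0, 1], and it scatters pixel colors into a fixed-size texture. The function names `unwrap_to_uv` and `visibility_mask` are hypothetical. Texels that no pixel maps to stay empty, which is how occluded or unseen body regions show up in such a representation.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]
UV = Optional[Tuple[float, float]]  # None marks a background pixel


def unwrap_to_uv(image: List[List[RGB]],
                 uv_coords: List[List[UV]],
                 size: int) -> List[List[Optional[RGB]]]:
    """Scatter image pixels into a size x size UV texture (illustrative only).

    image[y][x] is an RGB triple; uv_coords[y][x] is the assumed per-pixel
    surface coordinate (u, v) in [0, 1], or None for background.
    """
    texture: List[List[Optional[RGB]]] = [[None] * size for _ in range(size)]
    for y, row in enumerate(image):
        for x, rgb in enumerate(row):
            uv = uv_coords[y][x]
            if uv is None:
                continue  # background pixels carry no body texture
            u, v = uv
            # Convert continuous coordinates to texel indices, clamped to the grid.
            col = min(int(u * size), size - 1)
            texel_row = min(int(v * size), size - 1)
            texture[texel_row][col] = rgb
    return texture


def visibility_mask(texture: List[List[Optional[RGB]]]) -> List[List[bool]]:
    """True where a texel was observed in the input photo, False otherwise."""
    return [[texel is not None for texel in row] for row in texture]
```

Because the texture layout does not depend on the pose in the input photo, two unpaired images of different people in different poses land in the same coordinate frame. A generator operating in this space would then be responsible for filling in the texels that the mask marks as unobserved.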
