Dress-1-to-3: Single Image to Simulation-Ready 3D Outfit with Diffusion Prior and Differentiable Physics
Xuan Li, Chang Yu, Wenxin Du, Ying Jiang, Tianyi Xie, Yunuo Chen, Yin Yang, Chenfanfu Jiang
TL;DR
Dress-1-to-3 turns a single-view image of a casually posed person into a physics-aware, simulation-ready 3D outfit in which the garments are separable from each other and from the posed human. It introduces a differentiable simulation framework based on CIPC that jointly optimizes 2D sewing patterns and the resulting 3D garment geometry, guided by multi-view RGB images and normal maps produced by a diffusion prior. The method starts from a SewFormer-based initial sewing pattern and adds differentiable pattern optimization, geometric regularizers, and texture generation to deliver realistic, dynamic garments that are ready for physics-based animation. Evaluation on the CloSe and 4D-Dress datasets shows improved geometry reconstruction, accurate sewing-pattern predictions, and convincing textured garment simulations, and ablations confirm the importance of patch symmetrization and the proposed regularizers. The approach enables practical applications in virtual try-on and animation. Its current limitations are the handling of multi-layer garments, texture fidelity, and reliance on the initial sewing-pattern estimate.
Abstract
Recent advances in large models have significantly improved image-to-3D reconstruction. However, the generated models are often fused into a single piece, limiting their applicability in downstream tasks. This paper focuses on 3D garment generation, a key area for applications like virtual try-on with dynamic garment animations, which require garments to be separable and simulation-ready. We introduce Dress-1-to-3, a novel pipeline that reconstructs physics-plausible, simulation-ready separated garments with sewing patterns and humans from an in-the-wild image. Starting with the image, our approach combines a pre-trained image-to-sewing-pattern generation model, which creates coarse sewing patterns, with a pre-trained multi-view diffusion model, which produces multi-view images. The sewing pattern is then refined using a differentiable garment simulator, supervised by the generated multi-view images. Extensive experiments demonstrate that our optimization approach substantially enhances the geometric alignment of the reconstructed 3D garments and humans with the input image. Furthermore, by integrating a texture generation module and a human motion generation module, we produce customized, physics-plausible, and realistic dynamic garment demonstrations. Project page: https://dress-1-to-3.github.io/
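The refinement idea in the abstract, adjusting sewing-pattern parameters by differentiating through a simulator until the result matches image-derived targets, can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the two-parameter rectangular panel, the closed-form `simulate` function, the constants `RADIUS` and `STRETCH`, and the plain gradient-descent loop are hypothetical stand-ins, not the paper's CIPC simulator, loss terms, or optimizer.

```python
import math

RADIUS = 0.15   # hypothetical body radius in metres (assumption)
STRETCH = 0.05  # hypothetical fractional elongation under gravity (assumption)


def simulate(width, length):
    """Toy stand-in for a differentiable garment simulator.

    Maps 2D panel parameters to two observable quantities:
    the frontal silhouette width of the panel wrapped around a
    cylinder, and the hem drop after stretching.
    """
    silhouette = 2.0 * RADIUS * math.sin(width / (2.0 * RADIUS))
    hem_drop = length * (1.0 + STRETCH)
    return silhouette, hem_drop


def loss_and_grad(width, length, target):
    """Squared error against the targets, with its analytic gradient."""
    silhouette, hem_drop = simulate(width, length)
    r_s = silhouette - target[0]
    r_h = hem_drop - target[1]
    loss = r_s ** 2 + r_h ** 2
    # Chain rule through the toy simulator.
    d_width = 2.0 * r_s * math.cos(width / (2.0 * RADIUS))
    d_length = 2.0 * r_h * (1.0 + STRETCH)
    return loss, (d_width, d_length)


def refine_pattern(width, length, target, lr=0.2, steps=500):
    """Gradient descent on the panel parameters."""
    for _ in range(steps):
        _, (g_w, g_l) = loss_and_grad(width, length, target)
        width -= lr * g_w
        length -= lr * g_l
    return width, length


# Targets play the role of the multi-view supervision; here they are
# synthesized from a known "true" panel so convergence can be checked.
target = simulate(0.30, 0.60)
refined_width, refined_length = refine_pattern(0.20, 0.50, target)
final_loss, _ = loss_and_grad(refined_width, refined_length, target)
```

In this toy setting the coarse initial guess plays the role of the predicted sewing pattern, and the optimized panel recovers the parameters that generated the targets. A real system would replace `simulate` with a full cloth simulation and the two scalar targets with rendered multi-view comparisons.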
