D-Garment: Physics-Conditioned Latent Diffusion for Dynamic Garment Deformations
Antoine Dumoulin, Adnane Boukhayma, Laurence Boissieux, Bharath Bhushan Damodaran, Pierre Hellier, Stefanie Wuhrer
TL;DR
D-Garment introduces a latent diffusion framework that generates temporally coherent dynamic garment deformations conditioned on body shape $β$, motion $θ_t$, and cloth material $γ$. The garment is represented as a displacement map in 2D UV space on a fixed template, which lets the model capture both large-scale deformations and fine wrinkles without explicit skinning and keeps the representation independent of mesh resolution. The model is trained on data from a physics-based simulator and combines a parametric body model, material parameters (stretch, density, bending), and a diffusion-based generator with losses for temporal consistency and collision penalties. A separate fitting procedure aligns the generated garment with 3D point clouds, so the model can be fitted efficiently at test time to observations from vision sensors. Evaluations on simulated and real multi-view data show lower Chamfer distance and improved physical plausibility compared to baselines such as HOOD and MGDDG, highlighting its potential for dynamic garment rendering and reconstruction in VR/AR and telepresence contexts.
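As a rough illustration of the UV-space displacement map idea, the sketch below deforms a fixed template mesh by adding a 3D offset sampled from a 2D map at each vertex's UV coordinate. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the bilinear sampling scheme, the per-vertex UV layout in [0, 1], and the choice to add offsets directly in world space are all hypothetical.

```python
import numpy as np

def sample_bilinear(disp_map, uv):
    """Bilinearly sample an (H, W, 3) displacement map at (N, 2) UV coordinates in [0, 1]."""
    h, w, _ = disp_map.shape
    # Map UV coordinates to continuous pixel coordinates (assumed convention: u -> x, v -> y).
    x = np.clip(uv[:, 0], 0.0, 1.0) * (w - 1)
    y = np.clip(uv[:, 1], 0.0, 1.0) * (h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1.0 - fx) * disp_map[y0, x0] + fx * disp_map[y0, x1]
    bottom = (1.0 - fx) * disp_map[y1, x0] + fx * disp_map[y1, x1]
    return (1.0 - fy) * top + fy * bottom

def apply_displacement(template_vertices, uv, disp_map):
    """Return deformed vertices: template position plus the sampled 3D offset (hypothetical scheme)."""
    return template_vertices + sample_bilinear(disp_map, uv)
```

Because the map is sampled at arbitrary UV coordinates, the same map can drive templates with different vertex counts, which is one way to read the claim that the representation does not depend on mesh resolution.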
Abstract
Adjusting and deforming 3D garments to body shapes, body motion, and cloth material is an important problem in virtual and augmented reality. Applications are numerous, ranging from virtual change rooms to the entertainment and gaming industry. This problem is challenging as garment dynamics influence geometric details such as wrinkling patterns, which depend on physical input including the wearer's body shape and motion, as well as cloth material features. Existing work studies learning-based modeling techniques to generate garment deformations from example data, and physics-inspired simulators to generate realistic garment dynamics. We propose here a learning-based approach trained on data generated with a physics-based simulator. Compared to prior work, our 3D generative model learns garment deformations for loose cloth geometry, especially for large deformations and dynamic wrinkles driven by body motion and cloth material. Furthermore, the model can be efficiently fitted to observations captured using vision sensors. We propose to leverage the capability of diffusion models to learn fine-scale detail: we model the 3D garment in a 2D parameter space, and learn a latent diffusion model using this representation, which is independent of the mesh resolution. This allows us to condition global and local geometric information on body and material information. We quantitatively and qualitatively evaluate our method on both simulated data and data captured with a multi-view acquisition platform. Compared to strong baselines, our method is more accurate in terms of Chamfer distance.
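For readers unfamiliar with the evaluation metric, the following is a minimal sketch of a symmetric Chamfer distance between two point sets. Conventions vary across papers (squared versus unsquared distances, sum versus mean, optional factor of one half), and the variant below is an assumption chosen for illustration rather than the exact definition used in the evaluation. The function name is hypothetical.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: mean unsquared nearest-neighbour distance from a to b,
    plus the same from b to a.
    """
    # Pairwise Euclidean distances, shape (N, M). Brute force, so memory grows as N * M.
    diff = a[:, None, :] - b[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    a_to_b = dist.min(axis=1).mean()  # each point in a to its nearest point in b
    b_to_a = dist.min(axis=0).mean()  # each point in b to its nearest point in a
    return a_to_b + b_to_a
```

The brute-force pairwise matrix is fine for a few thousand points; for dense garment scans a spatial index such as a k-d tree would be the usual substitute.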
