GARField: Addressing the visual Sim-to-Real gap in garment manipulation with mesh-attached radiance fields
Donatien Delehelle, Darwin G. Caldwell, Fei Chen
TL;DR
GARField tackles the data bottleneck in deformable garment manipulation by learning a differentiable rendering pipeline that generates realistic observations from simulated garment states. It defines a mesh-attached, scene-embedded representation built on signed distance and visual feature fields, and uses a two-stage process, scene capture followed by re-rendering, to produce labeled data for novel poses. The approach introduces mesh-based coordinates via barycentric embeddings and Laplacian positional embeddings, coupled with a four-term training loss and view-direction augmentation, and achieves faithful reconstruction and re-posed rendering from limited viewpoints. The work could enable robots to "imagine" manipulation outcomes in observation space, reducing reliance on costly real-world data and narrowing the sim-to-real gap in textile manipulation, albeit at substantial computational cost. It lays a foundation for higher-fidelity, differentiable garment rendering, inspired by NeRF-like techniques, aimed at dynamic, real-world deployment.
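The summary above does not say how the barycentric embeddings are computed. As a minimal sketch of the general idea only, and not the authors' implementation, the snippet below computes standard barycentric coordinates of a point on a triangle and reuses them to move that point along with a deformed copy of the triangle. The function names `barycentric` and `from_barycentric` and the example triangles are assumptions made for illustration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def barycentric(p: Vec3, a: Vec3, b: Vec3, c: Vec3) -> Vec3:
    """Weights (u, v, w) such that u*a + v*b + w*c is the projection of p
    onto the plane of triangle (a, b, c). Hypothetical helper."""
    v0, v1, v2 = _sub(b, a), _sub(c, a), _sub(p, a)
    d00, d01, d11 = _dot(v0, v0), _dot(v0, v1), _dot(v1, v1)
    d20, d21 = _dot(v2, v0), _dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    if denom == 0.0:
        raise ValueError("degenerate triangle")
    v = (d11 * d20 - d01 * d21) / denom
    w = (d00 * d21 - d01 * d20) / denom
    return (1.0 - v - w, v, w)


def from_barycentric(weights: Vec3, a: Vec3, b: Vec3, c: Vec3) -> Vec3:
    """Map barycentric weights back to a 3D point on triangle (a, b, c)."""
    u, v, w = weights
    return (
        u * a[0] + v * b[0] + w * c[0],
        u * a[1] + v * b[1] + w * c[1],
        u * a[2] + v * b[2] + w * c[2],
    )


# Rest-pose triangle and a point lying on it (illustrative values).
rest = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
weights = barycentric((0.25, 0.25, 0.0), *rest)

# The same triangle after a deformation: stretched and lifted.
posed = ((0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (0.0, 2.0, 1.0))
moved = from_barycentric(weights, *posed)
```

Because the weights are expressed relative to the triangle rather than to world space, anything attached to them follows the mesh when it is re-posed, which is the property a mesh-attached field relies on.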
Abstract
While humans intuitively manipulate garments and other textile items swiftly and accurately, doing so remains a significant challenge for robots. A factor crucial to human performance is the ability to imagine, a priori, the intended result of a manipulation and hence to predict the resulting garment pose. That ability allows us to plan from highly obstructed states, adapt our plans as we collect more information, and react swiftly to unforeseen circumstances. Conversely, robots struggle to establish such intuitions and to form tight links between plans and observations. We can partly attribute this to the high cost of obtaining densely labelled data for textile manipulation, in terms of both quality and quantity. Data collection is a long-standing problem in data-based approaches to garment manipulation. To date, high-quality labelled garment manipulation data has mainly been produced through advanced data capture procedures that create simplified state estimations from real-world observations. This work instead takes the reverse approach: generating real-world observations from object states. To achieve this, we present GARField (Garment Attached Radiance Field), the first differentiable rendering architecture, to our knowledge, for data generation from simulated states stored as triangle meshes. Code is available at https://ddonatien.github.io/garfield-website/
