How Will It Drape Like? Capturing Fabric Mechanics from Depth Images

Carlos Rodriguez-Pardo, Melania Prieto-Martin, Dan Casas, Elena Garces

TL;DR

The paper tackles scalable, perceptually meaningful estimation of fabric mechanics from casual depth captures. It presents a depth-image based pipeline that infers six bending and stretching parameters from two static views in hanging and stretch configurations, using a sim-to-real learning framework and a novel image-domain drape similarity metric aligned with human judgments. A synthetic-then-real evaluation regime, augmented data, and a multi-image fusion capability enable robust generalization to real fabrics captured with consumer hardware, advancing digital twin cloth representations. The perceptual drape metric and ablation studies demonstrate that parameter-space errors do not always reflect perceptual drape quality, underlining the importance of domain-aligned evaluation for cloth systems.

Abstract

We propose a method to estimate the mechanical parameters of fabrics using a casual capture setup with a depth camera. Our approach enables the creation of mechanically correct digital representations of real-world textile materials, which is a fundamental step for many interactive design and engineering applications. As opposed to existing capture methods, which typically require expensive setups, video sequences, or manual intervention, our solution can capture at scale, is agnostic to the optical appearance of the textile, and facilitates fabric arrangement by non-expert operators. To this end, we propose a sim-to-real strategy to train a learning-based framework that takes one or multiple images as input and outputs a full set of mechanical parameters. Thanks to carefully designed data augmentation and transfer learning protocols, our solution generalizes to real images despite being trained only on synthetic data, hence successfully closing the sim-to-real loop. Key to our work is demonstrating that evaluating regression accuracy by similarity in parameter space leads to inaccurate distances that do not match human perception. To overcome this, we propose a novel metric for fabric drape similarity that operates on the image domain instead of the parameter space, allowing us to evaluate our estimation within the context of a similarity rank. We show that our metric correlates with human judgments about the perception of drape similarity, and that our model predictions produce perceptually accurate results compared to the ground truth parameters.
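
The claim that a drape metric "correlates with human judgments" can be checked with a rank correlation. The following is a minimal sketch, assuming Spearman rank correlation as the agreement measure (the abstract does not say which statistic the authors used); the function names and the example scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: distances from an image-domain metric and
# dissimilarity ratings from human observers for the same fabric pairs.
metric_distance = [0.10, 0.35, 0.20, 0.80, 0.55]
human_dissimilarity = [1.0, 3.0, 2.5, 5.0, 4.0]
agreement = spearman(metric_distance, human_dissimilarity)
```

A value of `agreement` near 1 would indicate that the metric orders fabric pairs the same way the observers do, which is the property a similarity rank depends on.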

Paper Structure (10 sections, 5 figures)

Figures (5)

  • Figure 1: Capture setup, RGB (top), and depth (bottom) images for hanging (left) and stretch (right) scenes in rest position. Each scene conveys a different mechanical appearance of the fabric: hanging exhibits the overall drape; stretch exhibits an extra diagonal tension, which is key to understanding the stretching properties.
  • Figure 2: An overview of the main components of our method. We propose a technique to estimate fabric mechanics using depth images of hanging and stretch scenes as input. To validate the error of our estimations in a perceptual manner (accounting for the global drape), we propose an image-based drape similarity metric, which we validate with human judgments and which can be used to sort fabrics by similarity. We show through several metrics that the estimations provided by our method using our similarity metric agree with those given by humans.
  • Figure 3: Spearman correlation matrix between parameters of our synthetic dataset.
  • Figure 4: Sweep of simulation parameters for hanging and stretch scenes. For kBending: warp, weft, and bias have the same value, while for kStretch, bias changes as 100, 144, 1000.
  • Figure 5: Diagram of our training and evaluation pipelines. For training, we use a single image of the material, along with its density. The image is processed by our Feature Extractor, followed by an MLP, which computes the parameter estimation $\hat{\mathcal{P}}$. For evaluation, we process each available image with our trained feature extractor and use a fusion operator before feeding it to the trained regressor. We use the same Feature Extractor and MLP for both scenes.
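
The evaluation path described in the Figure 5 caption (extract features per image, fuse them, then regress) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the caption does not specify the fusion operator, so an element-wise mean is assumed here, and the density is assumed to be appended to the fused features before regression. `toy_extract` and `toy_regress` are hypothetical stand-ins for the trained Feature Extractor and MLP.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Image = List[List[float]]


def fuse_mean(features: Sequence[Vector]) -> Vector:
    """Element-wise mean over per-image feature vectors (assumed fusion operator)."""
    if not features:
        raise ValueError("at least one feature vector is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def estimate_parameters(
    images: Sequence[Image],
    density: float,
    extract: Callable[[Image], Vector],
    regress: Callable[[Vector], Vector],
) -> Vector:
    """Run the shared extractor on every image, fuse, append density, regress."""
    features = [extract(image) for image in images]
    fused = fuse_mean(features)
    # Assumption: density joins the fused features as one extra input value.
    return regress(fused + [density])


def toy_extract(depth_image: Image) -> Vector:
    """Hypothetical stand-in for the trained feature extractor."""
    flat = [v for row in depth_image for v in row]
    return [sum(flat) / len(flat), max(flat)]


def toy_regress(x: Vector) -> Vector:
    """Hypothetical stand-in for the trained MLP; returns six values."""
    return [sum(x) * (k + 1) for k in range(6)]


# Two tiny made-up depth images, one per scene.
hanging = [[1.0, 2.0], [3.0, 4.0]]
stretch = [[2.0, 2.0], [2.0, 2.0]]
estimate = estimate_parameters([hanging, stretch], 0.5, toy_extract, toy_regress)
```

Because the same extractor and regressor are applied regardless of how many images are supplied, this arrangement accepts one image or several without retraining, which matches the single-image training and multi-image evaluation described in the caption.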