AdaFold: Adapting Folding Trajectories of Cloths via Feedback-loop Manipulation

Alberta Longhini, Michael C. Welle, Zackory Erickson, Danica Kragic

TL;DR

AdaFold targets robust folding of cloths with varying physical properties by combining a particle-based cloth state with semantic descriptors in a model-based feedback loop. The approach uses a learned forward model $f_\theta$ and an adaptation module $g_\psi$ that encodes history into a latent $z_t$. Together they enable online replanning with MPPI over a horizon $H$: the planner seeks actions $a^*_{0:T}$ that minimize the cost $\mathcal{J}(\tau_{0:T})$ subject to the dynamics $P_{t+1}=f(P_t,x_t,a_t,\xi)$. Perception builds a semantically labeled cloth point cloud $P=P^U\cup P^B$ from RGB-D images using segmentation masks and tracks the layers with video trackers, which improves state discrimination under deformation. In simulation and on real cloths, AdaFold outperforms fixed trajectories and model-free baselines, generalizes to unseen shapes and properties, and benefits from the semantic cloth representation for more accurate state estimation.

Abstract

We present AdaFold, a model-based feedback-loop framework for optimizing folding trajectories. AdaFold extracts a particle-based representation of the cloth from RGB-D images and feeds this representation back to a model predictive controller, which replans the folding trajectory at every time step. A key component of AdaFold that enables feedback-loop manipulation is the use of semantic descriptors extracted from geometric features. These descriptors enrich the particle representation of the cloth so that otherwise ambiguous point clouds of differently folded cloths can be distinguished. Our experiments demonstrate AdaFold's ability to adapt folding trajectories to cloths with varying physical properties and to generalize from simulated training to real-world execution.
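
The replanning loop described above can be illustrated with a minimal MPPI-style sketch. This is not the authors' implementation: the dynamics here are a toy point mass standing in for the learned forward model, the cost is a plain squared distance to a goal, and every name (`mppi_update`, `run_toy_fold`, `temperature`, `sigma`) is a hypothetical choice made for illustration only.

```python
import numpy as np


def mppi_update(state, nominal, dynamics, cost, n_samples=256, sigma=0.05,
                temperature=0.05, rng=None):
    """One MPPI-style update of a nominal action sequence of shape (H, action_dim)."""
    rng = np.random.default_rng() if rng is None else rng
    horizon, act_dim = nominal.shape
    # Sample Gaussian perturbations around the current nominal sequence.
    noise = rng.normal(0.0, sigma, size=(n_samples, horizon, act_dim))
    candidates = nominal[None] + noise
    costs = np.empty(n_samples)
    for n in range(n_samples):
        s, total = state, 0.0
        for t in range(horizon):
            s = dynamics(s, candidates[n, t])  # stand-in for a learned forward model
            total += cost(s)
        costs[n] = total
    # Exponentially weight candidates: lower cost gives higher weight.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # The new nominal sequence is the old one plus the weighted average noise.
    return nominal + np.einsum("n,nha->ha", weights, noise)


def run_toy_fold(steps=30, horizon=5, seed=0):
    """Closed-loop replanning on a toy 3D point mass (assumption, not a cloth model)."""
    rng = np.random.default_rng(seed)
    goal = np.array([0.2, 0.0, 0.1])

    def dynamics(s, a):
        return s + a  # toy integrator: the action is a displacement

    def cost(s):
        return float(np.sum((s - goal) ** 2))

    state = np.zeros(3)
    nominal = np.zeros((horizon, 3))
    for _ in range(steps):
        # Replan from the latest state, execute only the first action, then shift.
        nominal = mppi_update(state, nominal, dynamics, cost, rng=rng)
        state = dynamics(state, nominal[0])
        nominal = np.vstack([nominal[1:], np.zeros((1, 3))])
    return state, goal
```

Calling `run_toy_fold()` returns the final state and the goal. The structure mirrors the feedback loop only in outline: observe, replan over a short horizon, execute one action, repeat.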

Paper Structure (21 sections, 6 equations, 7 figures, 5 tables, 1 algorithm)

Figures (7)

  • Figure 1: AdaFold successfully adapts the folding trajectories of two cloths with different physical properties, achieving a better fold than a predefined triangular trajectory.
  • Figure 2: Overview of AdaFold for feedback-loop manipulation of cloths. Given a set of pick-and-place positions $(x_{pick}, x_{place})$, AdaFold optimizes the best folding action $a^*_t$ at each time-step $t$. RGB-D observations from different calibrated cameras are used to extract point cloud representations with semantic descriptors. The semantic descriptors Upper and Bottom are obtained based on geometric features following Miller et al. (2011). The optimal folding action $a^*_t$ is obtained with MPC, which uses the forward and adaptation modules $f_\theta$ and $g_\psi$ to evaluate the candidate trajectories $a^n$ (light blue) and update the optimal control sequence $a^*$ (dark blue).
  • Figure 3: Cloth labeling: Given the cloth mask at time $t=0$, we obtain fold lines based on geometric features following Miller et al. (2011). These fold lines define the upper and bottom layers based on the pick and place locations.
  • Figure 4: Visualization of the real-world setup: two RealSense D435 cameras capturing different views of the scene, and the dataset composed of $7$ cloths.
  • Figure 5: Cost and action initialization ablation. The reference performance of the Triangular trajectory is shown as the red horizontal dashed line. The fold is executed 20 times for each ablation.
  • ...and 2 more figures
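
The layer labeling described in the Figure 3 caption can be sketched as a simple half-plane test. This is an illustrative assumption, not the paper's procedure: the fold line is taken to be the perpendicular bisector of the pick-to-place segment, points are 2D positions on the table plane at $t=0$, and the function name `label_layers` is hypothetical.

```python
import numpy as np


def label_layers(points_xy, x_pick, x_place):
    """Return a boolean mask: True for the upper layer, False for the bottom layer.

    Assumption: the fold line is the perpendicular bisector of the segment
    from x_pick to x_place, and the side containing x_pick is the layer that
    gets lifted (upper).
    """
    points_xy = np.asarray(points_xy, dtype=float)
    x_pick = np.asarray(x_pick, dtype=float)
    x_place = np.asarray(x_place, dtype=float)
    midpoint = 0.5 * (x_pick + x_place)   # a point on the assumed fold line
    normal = x_pick - x_place             # points toward the pick side
    signed_side = (points_xy - midpoint) @ normal
    return signed_side > 0.0


# A point near the pick corner is labeled upper, one near the place corner bottom.
mask = label_layers([[0.9, 0.9], [0.1, 0.1]], x_pick=(1.0, 1.0), x_place=(0.0, 0.0))
```

In this sketch the labels are computed once from the initial flat configuration; keeping them attached to points as the cloth deforms would need a tracker, which is outside the scope of the example.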