DeFNet: Deconstructed Strategy for Multi-step Fabric Folding Tasks
Ningquan Gu, Ruhan He, Lianqing Yu
TL;DR
DeFNet tackles long-horizon fabric folding by decomposing the task into three modules: a Folding Planning Module (FPM) that operates in latent space to infer the shortest folding path, a Folding Action Module (FAM) that uses FlowNet-based optical flow to determine grasp-and-place actions, and an Iterative Interactive Module (IIM) that re-plans after each action to mitigate execution drift. The FPM uses a Variational Autoencoder (VAE) and a Latent Space Roadmap (LSR) to map a start-goal pair to a sequence of intermediate states, while the FAM computes actions via a flow-based policy built on FlowNet and PickNet. The IIM closes the loop by feeding the current observation back in as a new start state and repeating planning and execution until the goal is reached. Across simulation and real-robot experiments, DeFNet outperforms three state-of-the-art baselines. Ablations show significant gains from incorporating latent-space folding paths and iterative re-planning.
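The planning step described above amounts to a shortest-path query over a roadmap graph. The following is a minimal sketch of that idea only, assuming the roadmap is already available as an unweighted adjacency dictionary; the function name `shortest_folding_path` and the state labels are hypothetical, and neither the VAE encoding nor the actual LSR construction is reproduced here.

```python
from collections import deque


def shortest_folding_path(roadmap, start, goal):
    """Breadth-first search over an unweighted roadmap graph.

    roadmap: dict mapping a node id to an iterable of neighbour ids
             (hypothetical stand-in for a latent space roadmap).
    Returns the list of nodes from start to goal, or None if unreachable.
    """
    if start == goal:
        return [start]
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in roadmap.get(node, ()):
            if nxt in parent:
                continue
            parent[nxt] = node
            if nxt == goal:
                # Walk the parent links back to the start.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None


# Toy roadmap with made-up fold states.
toy_roadmap = {
    "flat": ["half", "corner"],
    "corner": ["half"],
    "half": ["quarter"],
    "quarter": [],
}
print(shortest_folding_path(toy_roadmap, "flat", "quarter"))
```

Breadth-first search is used here because the toy graph has unit-cost edges; a weighted roadmap would call for Dijkstra's algorithm instead.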
Abstract
Robotic fabric folding is complex and challenging because fabric is deformable. Following a deconstruction strategy, we split the complex fabric folding task into three relatively simple sub-tasks and propose the Deconstructed Fabric Folding Network (DeFNet), which comprises three corresponding modules. (1) The Folding Planning Module (FPM), based on the Latent Space Roadmap, infers the most direct sequence of intermediate folding states from the start to the goal in latent space. (2) The Folding Action Module (FAM), a flow-based approach, calculates the action coordinates and executes them to reach the inferred intermediate state. (3) The Iterative Interactive Module (IIM) re-executes the FPM and FAM after every grasp-and-place action until the fabric reaches the goal. We evaluate our method on multi-step fabric folding tasks against three baselines in simulation. We also apply the method to an existing robotic system and present its performance.
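The interplay of the three modules can be pictured as a plan-act-observe loop. The sketch below is illustrative only and assumes the planner, action module, and observation source are supplied as plain callables; the names `iterative_fold`, `observe`, `plan`, `act`, and `reached`, as well as the `max_steps` cap, are hypothetical and do not come from the paper.

```python
def iterative_fold(observe, plan, act, goal, reached, max_steps=20):
    """Re-plan from the current observation after every action.

    observe(): returns the current state.
    plan(state, goal): returns a list of states, current state first,
                       or None if no path is found.
    act(state, subgoal): tries to move the system toward subgoal.
    reached(state, goal): True when the goal is considered met.
    Returns (success, list of attempted sub-goals).
    """
    attempted = []
    for _ in range(max_steps):
        state = observe()
        if reached(state, goal):
            return True, attempted
        path = plan(state, goal)
        if path is None or len(path) < 2:
            return False, attempted
        subgoal = path[1]           # commit only to the next intermediate state
        act(state, subgoal)         # the outcome is checked on the next pass
        attempted.append(subgoal)
    return reached(observe(), goal), attempted


# Toy usage: states are integers and the second action fails on purpose.
world = {"state": 0, "calls": 0}

def toy_observe():
    return world["state"]

def toy_plan(state, goal):
    return list(range(state, goal + 1))

def toy_act(state, subgoal):
    world["calls"] += 1
    if world["calls"] != 2:         # simulate one failed grasp-and-place
        world["state"] = subgoal

success, attempted = iterative_fold(
    toy_observe, toy_plan, toy_act, goal=3, reached=lambda s, g: s == g
)
print(success, attempted)
```

Because only the first step of each plan is executed before observing again, a failed action simply leads to the same sub-goal being attempted once more from the state actually reached.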
