Diffusing Trajectory Optimization Problems for Recovery During Multi-Finger Manipulation
Abhinav Kumar, Fan Yang, Sergio Aguilera Marinovic, Soshi Iba, Rana Soltani Zarrin, Dmitry Berenson
TL;DR
The paper tackles recovery from perturbations in fine multi-finger manipulation by introducing D-TOUR, a diffusion-based framework that detects when recovery is needed via likelihood-based out-of-distribution (OOD) detection and then generates contact-rich recovery trajectories. It combines an offline data-generation pipeline with a diffusion model that distills recovery planning into a joint diffusion over trajectories and contact modes, conditioned on the initial state so that recovery trajectories are feasible and constraint-satisfying. The approach is evaluated on valve-turning and screwdriver-turning tasks in simulation and on hardware, where it outperforms a reinforcement learning baseline and methods that lack explicit contact reasoning; the distilled diffusion model also speeds up online planning. The work targets reliable, high-precision manipulation on real robots, where perturbations can derail task execution, by enabling robust task resumption with interpretable contact control.
Abstract
Multi-fingered hands are emerging as powerful platforms for performing fine manipulation tasks, including tool use. However, environmental perturbations or execution errors can impede task performance, motivating the use of recovery behaviors that enable normal task execution to resume. In this work, we take advantage of recent advances in diffusion models to construct a framework that autonomously identifies when recovery is necessary and optimizes contact-rich trajectories to recover. We use a diffusion model trained on the task to estimate when states are not conducive to task execution, framed as an out-of-distribution detection problem. We then use diffusion sampling to project these states in-distribution and use trajectory optimization to plan contact-rich recovery trajectories. We also propose a novel diffusion-based approach that distills this process to efficiently diffuse the full parameterization, including constraints, goal state, and initialization, of the recovery trajectory optimization problem, saving time during online execution. We compare our method to a reinforcement learning baseline and other methods that do not explicitly plan contact interactions, including on a hardware screwdriver-turning task where we show that recovering using our method improves task performance by 96% and that ours is the only method evaluated that can attempt recovery without causing catastrophic task failure. Videos can be found at https://dtourrecovery.github.io/.
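To make the out-of-distribution framing concrete, the following is a minimal sketch of likelihood-thresholded recovery detection. It is not the paper's implementation: a diagonal Gaussian fit to nominal task states stands in for the diffusion model's likelihood estimate, and every name here (`fit_diag_gaussian`, `calibrate_threshold`, `needs_recovery`), the 4-dimensional state, and the 1% quantile are assumptions chosen for illustration.

```python
import math
import random
import statistics


def fit_diag_gaussian(states):
    """Fit a per-dimension Gaussian to nominal states.

    Assumption: this simple density is a stand-in for the likelihood
    that the paper estimates with a diffusion model.
    """
    dims = list(zip(*states))
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.pstdev(d) + 1e-9 for d in dims]  # avoid zero std
    return means, stds


def log_likelihood(state, means, stds):
    """Log-density of one state under the diagonal Gaussian."""
    return sum(
        -0.5 * ((x - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        for x, m, s in zip(state, means, stds)
    )


def calibrate_threshold(scores, quantile=0.01):
    """Pick a low quantile of nominal scores as the OOD threshold.

    Assumption: the 1% quantile is illustrative, not a value from the paper.
    """
    ordered = sorted(scores)
    index = int(quantile * (len(ordered) - 1))
    return ordered[index]


def needs_recovery(state, means, stds, threshold):
    """Flag a state as out-of-distribution when its likelihood is too low."""
    return log_likelihood(state, means, stds) < threshold


# Hypothetical nominal data: 4-D states clustered around the origin.
random.seed(0)
nominal = [[random.gauss(0.0, 0.1) for _ in range(4)] for _ in range(2000)]

means, stds = fit_diag_gaussian(nominal)
threshold = calibrate_threshold(
    [log_likelihood(s, means, stds) for s in nominal]
)

in_distribution = needs_recovery([0.0, 0.0, 0.0, 0.0], means, stds, threshold)
perturbed = needs_recovery([2.0, 2.0, 2.0, 2.0], means, stds, threshold)
```

In this sketch a state near the nominal data is not flagged, while a heavily perturbed state is. The framework described above would then project the flagged state back in-distribution with diffusion sampling and plan a contact-rich recovery trajectory toward it; neither step is modeled here.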
