Cascaded Diffusion Models for Neural Motion Planning
Mohit Sharma, Adam Fishman, Vikash Kumar, Chris Paxton, Oliver Kroemer
TL;DR
The paper tackles global motion planning from raw perception in cluttered environments by introducing a cascaded hierarchy of diffusion models that generate coarse global plans and progressively refine them to satisfy local constraints. A plan-patching refinement step repairs remaining constraint violations, yielding collision-free, long-horizon trajectories. Empirical results across 2D navigation and 7D manipulation domains show the cascaded diffusion approach outperforming a range of baselines by roughly 5%, with further gains in robustness when reference trajectories and plan refinement are used. The work makes perception-driven motion planning more reliable in complex scenes, while acknowledging that runtime remains an area for improvement, for example through faster diffusion techniques.
Abstract
Robots in the real world need to perceive and move to goals in complex environments without collisions. Avoiding collisions is especially difficult when relying on sensor perception and when goals are among clutter. Diffusion policies and other generative models have shown strong performance in solving local planning problems, but often struggle to avoid all of the subtle constraint violations that characterize truly challenging global motion planning problems. In this work, we propose an approach for learning global motion planning using diffusion policies, allowing the robot to generate full trajectories through complex scenes and reason about multiple obstacles along the path. Our approach uses cascaded hierarchical models that unify global prediction and local refinement, together with online plan repair to ensure the trajectories are collision-free. Our method outperforms a wide variety of baselines (by ~5%) on challenging tasks in multiple domains, including navigation and manipulation.
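The coarse-to-fine structure described above can be illustrated with a toy 2D sketch. This is not the authors' implementation: the learned diffusion samplers are replaced here by hand-written stand-ins (a straight-line coarse plan, linear upsampling, and a geometric push-out in place of plan patching), and every name in it (`coarse_plan`, `upsample`, `patch`, `cascaded_plan`, the circular obstacle format) is hypothetical. It only shows the control flow of generating a coarse plan, refining it at increasing resolution, and repairing violations at each level.

```python
import math

# Obstacles are circles (cx, cy, r). Hypothetical format, for illustration only.

def in_collision(p, obstacles, margin=0.0):
    """True if point p lies inside any circular obstacle (plus margin)."""
    return any(math.hypot(p[0] - cx, p[1] - cy) < r + margin
               for cx, cy, r in obstacles)

def coarse_plan(start, goal, n):
    """Stand-in for the coarse global model: n evenly spaced waypoints
    on the straight line from start to goal."""
    return [(start[0] + (goal[0] - start[0]) * i / (n - 1),
             start[1] + (goal[1] - start[1]) * i / (n - 1))
            for i in range(n)]

def upsample(path, factor):
    """Stand-in for a finer model in the cascade: linearly interpolate
    `factor` points per segment of the coarser path."""
    out = []
    for a, b in zip(path, path[1:]):
        for k in range(factor):
            t = k / factor
            out.append((a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t))
    out.append(path[-1])
    return out

def patch(path, obstacles, margin=0.05):
    """Stand-in for plan patching: push each colliding interior waypoint
    radially out of the obstacle it violates. Endpoints are left fixed.
    Only waypoints are checked, not the segments between them."""
    fixed = list(path)
    for i in range(1, len(fixed) - 1):
        x, y = fixed[i]
        for cx, cy, r in obstacles:
            d = math.hypot(x - cx, y - cy)
            if d < r + margin:
                # Pick an arbitrary direction if the point sits on the center.
                ux, uy = ((x - cx) / d, (y - cy) / d) if d > 0 else (0.0, 1.0)
                x, y = cx + ux * (r + margin), cy + uy * (r + margin)
        fixed[i] = (x, y)
    return fixed

def cascaded_plan(start, goal, obstacles, n_coarse=5, factors=(4, 4)):
    """Coarse plan first, then alternate refinement and repair per level."""
    path = patch(coarse_plan(start, goal, n_coarse), obstacles)
    for factor in factors:
        path = patch(upsample(path, factor), obstacles)
    return path

# Usage: one obstacle sitting close to the straight-line route.
obstacles = [(0.5, 0.45, 0.2)]
path = cascaded_plan((0.0, 0.0), (1.0, 1.0), obstacles)
print(len(path), "waypoints; any in collision:",
      any(in_collision(p, obstacles) for p in path))
```

In the sketch, repair runs after every level rather than once at the end, so that finer levels start from a plan that already avoids the obstacle. With several overlapping obstacles the simple push-out could move a waypoint from one obstacle into another, which is one reason a learned, scene-conditioned repair step is attractive.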
