Cascaded Diffusion Models for Neural Motion Planning

Mohit Sharma, Adam Fishman, Vikash Kumar, Chris Paxton, Oliver Kroemer

TL;DR

The paper tackles global motion planning from raw perception in cluttered environments by introducing a cascaded hierarchy of diffusion models that generate coarse global plans and progressively refine them to satisfy local constraints. A plan-patching refinement step further improves robustness to constraint violations, enabling collision-free, long-horizon trajectories. Empirical results across 2D navigation and 7D manipulation domains show the cascaded diffusion approach outperforms a range of baselines by roughly 5%, with improved robustness when using reference trajectories and plan refinement. The work enables more reliable, perception-driven motion planning in complex scenes, while acknowledging runtime as an area for future improvement through faster diffusion techniques.

Abstract

Robots in the real world need to perceive and move to goals in complex environments without collisions. Avoiding collisions is especially difficult when relying on sensor perception and when goals are among clutter. Diffusion policies and other generative models have shown strong performance in solving local planning problems, but often struggle to avoid all of the subtle constraint violations that characterize truly challenging global motion planning problems. In this work, we propose an approach for learning global motion planning using diffusion policies, allowing the robot to generate full trajectories through complex scenes and reason about multiple obstacles along the path. Our approach uses cascaded hierarchical models, which unify global prediction and local refinement, together with online plan repair to ensure the trajectories are collision-free. Our method outperforms a wide variety of baselines (by ~5%) on challenging tasks in multiple domains, including navigation and manipulation.
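
The coarse-to-fine control flow described in the abstract can be illustrated with a small Python sketch. It is a toy under loud assumptions, not the authors' implementation: every function name here is hypothetical, `toy_model` is a deterministic stand-in for a learned diffusion sampler, obstacles are circles rather than point clouds, and collisions are checked at waypoints only. It shows the structure alone: a coarse level proposes sub-goals, a fine level fills in each segment, and a patching pass repairs the waypoints that still violate constraints.

```python
import math

Point = tuple[float, float]
Obstacle = tuple[Point, float]  # (centre, radius); circles are a toy assumption

def in_collision(p: Point, obstacles: list[Obstacle]) -> bool:
    """True if waypoint p lies strictly inside any circular obstacle."""
    return any(math.dist(p, c) < r for c, r in obstacles)

def interpolate(a: Point, b: Point, n: int) -> list[Point]:
    """Return n + 1 evenly spaced waypoints from a to b, endpoints included."""
    return [((1 - i / n) * a[0] + (i / n) * b[0],
             (1 - i / n) * a[1] + (i / n) * b[1]) for i in range(n + 1)]

def nudge(p: Point, obstacles: list[Obstacle],
          margin: float = 0.1, strength: float = 0.5) -> Point:
    """Move p part of the way out of each obstacle it violates (soft repair)."""
    x, y = p
    for (cx, cy), r in obstacles:
        d = math.hypot(x - cx, y - cy)
        if d < r:
            # Radial direction; pick an arbitrary one at the exact centre.
            ux, uy = ((x - cx) / d, (y - cy) / d) if d > 0 else (1.0, 0.0)
            d_new = d + strength * (r + margin - d)
            x, y = cx + ux * d_new, cy + uy * d_new
    return (x, y)

def toy_model(start: Point, goal: Point, n: int,
              obstacles: list[Obstacle]) -> list[Point]:
    """Hypothetical stand-in for one level of the cascade.

    A real level would sample waypoints from a diffusion model conditioned
    on perception; here we interpolate and softly nudge interior points.
    """
    pts = interpolate(start, goal, n)
    return [pts[0]] + [nudge(p, obstacles) for p in pts[1:-1]] + [pts[-1]]

def cascade(start: Point, goal: Point, obstacles: list[Obstacle],
            n_coarse: int = 4, n_fine: int = 5) -> list[Point]:
    """Coarse level proposes sub-goals; fine level fills in each segment."""
    subgoals = toy_model(start, goal, n_coarse, obstacles)
    plan = [subgoals[0]]
    for a, b in zip(subgoals, subgoals[1:]):
        plan += toy_model(a, b, n_fine, obstacles)[1:]  # skip shared endpoint
    return plan

def patch(plan: list[Point], obstacles: list[Obstacle],
          max_rounds: int = 10) -> list[Point]:
    """Repair only the interior waypoints that still violate constraints."""
    plan = list(plan)
    for _ in range(max_rounds):
        bad = [i for i in range(1, len(plan) - 1)
               if in_collision(plan[i], obstacles)]
        if not bad:
            break
        for i in bad:
            plan[i] = nudge(plan[i], obstacles)
    return plan
```

Calling `patch(cascade(start, goal, obstacles), obstacles)` gives the full pipeline. In this toy the soft nudge deliberately leaves small residual violations after the cascade, which loosely mirrors the subtle collisions the paper's plan repair is meant to catch; the paper's repair uses the lowest-level diffusion model rather than a geometric nudge.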

Paper Structure

This paper contains 11 sections, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Overview of our cascaded diffusion model approach. The higher-level model generates a coarse plan in the form of sub-goals and reference points. The lower-level model takes these as input and outputs a plan that satisfies local constraints.
  • Figure 2: We use the lowest level model in our cascaded hierarchy of diffusion models to refine paths. This diffusion model operates locally with the same planning horizon as the model directly above it.
  • Figure 3: Our cascaded diffusion model architecture uses high-dimensional observations (point clouds) and coarse plans from the high-level model to output complete plans using a diffusion process.
  • Figure 4: Overview of the environments we use to evaluate our approach. We use a 2D navigation problem (left) as well as multiple reaching tasks with a simulated Franka Panda arm (Fishman et al., 2023).
  • Figure 5: Qualitative results. Red spheres show collisions with the environment. Note that many collisions are very subtle, and the optimal solution is very close to being in collision; this is part of the difficulty of our problem setting.
  • ...and 1 more figure