DiPPeR: Diffusion-based 2D Path Planner applied on Legged Robots

Jianwei Liu; Maria Stamatopoulou; Dimitrios Kanoulas

TL;DR

DiPPeR is a diffusion-based 2D path planner for quadrupedal robots that conditions trajectory generation on map images and fixes the start and goal positions through inpainting. It trains a DDPM with a ResNet-18 visual encoder and FiLM conditioning on a dataset of $10{,}000$ maps with $100$ trajectories each, using a horizon of $T=180$ and $k=1000$ diffusion steps, to produce trajectories $A_t$. Inference takes about $0.4$ s per trajectory with an average feasibility of $87\%$, and trajectory generation is roughly $23\times$ faster than $A^*$, Neural $A^*$, and ViT-$A^*$ across varying map sizes and obstacle structures. Real-world deployment on Spot and Go1 demonstrates platform-agnostic integration with ROS navigation and local planners. The work highlights both the practical potential and the current limits of the approach, notably its dependence on horizon estimation, with future directions including transformer-based diffusion models and expanded map representations.
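The training-side ingredients named above (DDPM noising of a trajectory and inpainting of the start/goal waypoints) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear beta schedule, the normalized coordinate range, the straight-line stand-in trajectory, and the helper names `linear_beta_schedule`, `add_noise`, and `inpaint_endpoints` are all assumptions made for illustration. Only the horizon $T=180$ and the $k=1000$ diffusion steps come from the summary.

```python
import numpy as np

HORIZON = 180     # trajectory horizon T (from the summary)
NUM_STEPS = 1000  # number of diffusion steps k (from the summary)


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear variance schedule; the paper's schedule may differ."""
    return np.linspace(beta_start, beta_end, num_steps)


def add_noise(traj, k, alpha_bar, rng):
    """Standard DDPM forward process: sample x_k given the clean trajectory x_0."""
    eps = rng.standard_normal(traj.shape)
    noisy = np.sqrt(alpha_bar[k]) * traj + np.sqrt(1.0 - alpha_bar[k]) * eps
    return noisy, eps


def inpaint_endpoints(traj, start, goal):
    """Overwrite the first and last waypoints with the known start and goal."""
    out = traj.copy()
    out[0] = start
    out[-1] = goal
    return out


betas = linear_beta_schedule(NUM_STEPS)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta)

rng = np.random.default_rng(0)
start = np.array([-0.8, -0.8])  # hypothetical normalized (x, y) start
goal = np.array([0.8, 0.8])     # hypothetical normalized (x, y) goal

# Straight-line stand-in for a dataset trajectory, shape (HORIZON, 2).
clean = np.linspace(start, goal, HORIZON)

# Noise the trajectory at the last diffusion step, then pin its endpoints.
noisy, eps = add_noise(clean, NUM_STEPS - 1, alpha_bar, rng)
pinned = inpaint_endpoints(noisy, start, goal)
```

In a training loop of this kind, a network would typically be asked to predict `eps` from the noisy trajectory, the step index, and the map embedding. That network is omitted here because its architecture is specific to the paper.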

Abstract

In this work, we present DiPPeR, a novel and fast 2D path planning framework for quadrupedal locomotion that leverages diffusion-driven techniques. Our contributions include a scalable dataset generator for map images and corresponding trajectories, an image-conditioned diffusion planner for mobile robots, and a training/inference pipeline employing CNNs. We validate our approach in several mazes, as well as in real-world deployment scenarios on Boston Dynamics' Spot and Unitree's Go1 robots. DiPPeR generates trajectories on average 23 times faster than both search-based and data-driven path planning algorithms, with an average of 87% consistency in producing feasible paths of various lengths in maps of variable size and obstacle structure. Website: https://rpl-cs-ucl.github.io/DiPPeR

Paper Structure (17 sections, 4 equations, 9 figures, 2 tables)

INTRODUCTION
RELATED WORK
Classical Path Planning
Data-Driven Path Planning
Diffusion for Path Planning
Preliminaries
Diffusion
METHOD
Data generation
Map generation
Path generation
Training Framework
RESULTS
Inference Pipeline
Simulation Results
...and 2 more sections

Figures (9)

Figure 1: Illustration of DiPPeR global path generation process.
Figure 2: DiPPeR image-conditioned diffusion training pipeline. A map image observation sample $O$ is fed to the ResNet-18 visual encoder and converted to latent embeddings $o$. The $x$ and $y$ coordinates of the start and goal positions are also added as part of $o$. Noise $\epsilon^{k}$ sampled from the prior Gaussian distribution is added to the trajectory instance $A_{t}$. The noisy sample is passed as input to the diffusion network $\epsilon_{\theta}$, which is conditioned on $o$. The network $\epsilon_{\theta}$ takes the form of a CNN and outputs the denoised action $A^{0}$.
Figure 3: Denoising diffusion steps ($k=1000$, path$_{l}=200$) to generate a path from noisy samples.
Figure 4: Generated samples from the dataset: a) examples of $100\times100$ random solvable maps and b) examples of trajectories, generated through $A^{*}$.
Figure 5: Validating DiPPeR's performance and generalization. A random point is selected for the start and goal position on the provided map. Maps a) and b) are part of the validation dataset and are used to check that the network connects the start and end points while avoiding obstacles. Maps c), d), and e) are used to test the ability of the network to generalize to out-of-distribution environments of varying scale, color, and obstacle structure.
...and 4 more figures
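
The denoising process described in the Figure 3 caption ($k=1000$ steps, path length $200$) can be sketched as a standard DDPM reverse loop with the endpoints re-pinned at every step. This is an illustrative sketch under stated assumptions, not the paper's code: the linear beta schedule, the variance choice $\sigma_k^2=\beta_k$, and the names `reverse_step`, `sample_path`, and `oracle_noise` are hypothetical. The trained network $\epsilon_{\theta}$ is replaced here by an oracle that knows a straight-line target path, purely so the loop can be run end to end.

```python
import numpy as np

NUM_STEPS = 1000  # diffusion steps k (from the Figure 3 caption)
PATH_LEN = 200    # path length (from the Figure 3 caption)

betas = np.linspace(1e-4, 2e-2, NUM_STEPS)  # assumed linear schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)


def reverse_step(x_k, k, eps_hat, rng):
    """One DDPM reverse step from x_k to x_{k-1}, given predicted noise eps_hat."""
    mean = (x_k - betas[k] / np.sqrt(1.0 - alpha_bar[k]) * eps_hat) / np.sqrt(alphas[k])
    if k == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[k]) * rng.standard_normal(x_k.shape)


def sample_path(predict_noise, start, goal, rng):
    """Denoise from pure Gaussian noise, pinning start and goal at each step."""
    x = rng.standard_normal((PATH_LEN, 2))
    for k in reversed(range(NUM_STEPS)):
        x[0], x[-1] = start, goal  # inpainting-style endpoint conditioning
        x = reverse_step(x, k, predict_noise(x, k), rng)
    x[0], x[-1] = start, goal
    return x


start = np.array([-0.8, -0.8])  # hypothetical normalized start
goal = np.array([0.8, 0.8])     # hypothetical normalized goal
target = np.linspace(start, goal, PATH_LEN)  # stand-in "true" path


def oracle_noise(x_k, k):
    """Stand-in for the learned network: the exact noise relative to `target`."""
    return (x_k - np.sqrt(alpha_bar[k]) * target) / np.sqrt(1.0 - alpha_bar[k])


path = sample_path(oracle_noise, start, goal, np.random.default_rng(0))
```

With the oracle in place of a learned model, the loop recovers the stand-in target path. A real planner would instead call a trained noise predictor conditioned on the map embedding, and the resulting path would depend on what that model has learned.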
