Diffusion-based Planning with Learned Viability Filters

Nicholas Ioannidis, Daniele Reda, Setareh Cohan, Michiel van de Panne

TL;DR

This work addresses humanoid footstep planning under uncertainty when constraints are hard or only implicit. It introduces learned viability filters ($\mathit{VF}$) that approximate the viability kernel via a Q-function and can be trained offline or online to filter diffusion-generated plans, enabling fast, constraint-aware planning and compositional constraint handling. The approach improves online feasibility and speed across platform traversal, hurdle negotiation, and obstacle avoidance; the online $\mathit{VF}$ offers the strongest gains, with inference times competitive with guidance-based diffusion. The proposed $\mathit{VF}$ framework integrates with diffusion planning to balance offline learning and online adaptation, enabling robust, real-time planning for complex locomotion tasks and suggesting broad applicability to multi-constraint control problems.

Abstract

Diffusion models can be used as motion planners by sampling from a distribution of possible futures. However, the samples may not satisfy hard constraints that exist only implicitly in the training data, e.g., avoiding falls or not colliding with a wall. We propose learned viability filters that efficiently predict the future success of any given plan, i.e., diffusion sample, and thereby enforce an implicit future-success constraint. Multiple viability filters can also be composed together. We demonstrate the approach on detailed footstep planning for challenging 3D human locomotion tasks, showing the effectiveness of viability filters in performing online planning and control for box-climbing, stepping over walls, and obstacle avoidance. We further show that using viability filters is significantly faster than guidance-based diffusion prediction.
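
The filter-then-select idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name `select_plan`, the two toy filters, and the use of a minimum to compose several filters are hypothetical stand-ins, not the paper's learned networks or its actual composition rule.

```python
import numpy as np

def select_plan(plans, filters, compose=np.min):
    """Score candidate plans with every filter and pick the most viable one.

    plans:   array of shape (num_plans, plan_dim), e.g. sampled from a planner
    filters: callables mapping plans -> viability scores, shape (num_plans,)
    compose: reduction over filters (ASSUMPTION: min, so a plan is only as
             viable as its weakest constraint; the paper's rule may differ)
    """
    scores = np.stack([f(plans) for f in filters], axis=0)  # (num_filters, num_plans)
    combined = compose(scores, axis=0)                      # (num_plans,)
    return int(np.argmax(combined)), combined

# Hypothetical hand-written filters standing in for learned viability filters.
def avoids_wall(plans):
    # Viable (1.0) only if the first coordinate stays below 1.0.
    return (plans[:, 0] < 1.0).astype(float)

def stays_on_platform(plans):
    # Viability decays linearly with distance from the platform centre line.
    return np.clip(1.0 - np.abs(plans[:, 1]), 0.0, 1.0)

# Three toy "plans"; in practice these would be diffusion samples.
plans = np.array([[0.5, 0.9], [1.5, 0.0], [0.2, 0.1]])
best, combined = select_plan(plans, [avoids_wall, stays_on_platform])
print(best, combined)
```

In this toy setup the second plan is rejected outright by the wall filter, and the third plan wins because it scores well under both filters. Because scoring is a single forward pass over already-sampled plans, nothing in the sampling loop itself needs to change.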

Paper Structure

This paper contains 37 sections, 18 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: The footstep-based planner with viability filters enables a humanoid to climb platforms, step-over hurdles, and avoid obstacles.
  • Figure 2: Toy example with unnormalized probability density functions. Learned diffusion models do not fully respect hard constraints, in particular when trained on noisy data, e.g., observations from a stochastic environment.
  • Figure 3: Inference pipeline of our compositional environment. A diffusion planner generates a set of possible plans given the diffusion state (including the character state, the height map and the waypoint). Given task-relevant information as input to each $\mathit{VF}$, each plan is evaluated by all $\mathit{VF}$s and a viability value is associated to each plan according to Eq. \ref{eq:composition_vf}. The plan with the highest viability is executed in the environment by the controller. Note that the inference pipeline is similar in the case of a single task, as the diffusion planner is the same for all, but the generated plans are evaluated on a single $\mathit{VF}$.
  • Figure 4: Training procedure. A humanoid controller follows procedurally-generated footstep trajectories. Each 4-footstep window is categorized as a success or failure. Successful plans are used to train the Diffusion Planner.
  • Figure 5: Training procedure for the Viability Filter in the Offline (left) and Online (right) settings. In the Online case, the best proposed plan from the Diffusion Planner is evaluated and stored. Transitions in the Experience Replay are used in an Online-RL fashion to update the Viability Filter.
  • ...and 2 more figures