Extendable Planning via Multiscale Diffusion

Chang Chen, Hany Hamed, Doojin Baek, Taegu Kang, Samyeul Noh, Yoshua Bengio, Sungjin Ahn

TL;DR

The paper addresses extendable long-horizon planning for diffusion-based planners, which are typically constrained by training trajectory lengths. It introduces a two-phase framework: Progressive Trajectory Extension (PTE) to synthesize much longer trajectories via multi-round compositional stitching, and Hierarchical Multiscale Diffuser (HM-Diffuser) to enable efficient planning across temporal scales, aided by Adaptive Plan Pondering (APP) and a Recursive HM-Diffuser. The authors also present the Plan Extendable Trajectory Suite (PETS) benchmark and demonstrate that HM-Diffuser-X trained on PTE-extended data achieves strong performance across Extendable Maze2D, Extendable Franka Kitchen, and Extendable Gym-MuJoCo, with ablations confirming the benefits of multiscale planning and data augmentation. This work advances scalable long-horizon decision-making in diffusion-based planning and shows promise for offline RL settings, though it notes limitations such as stitching quality, lack of visual inputs, and the need for test-time search refinements.

Abstract

Long-horizon planning is crucial in complex environments, but diffusion-based planners like Diffuser are limited by the trajectory lengths observed during training. This creates a dilemma: long trajectories are needed for effective planning, yet they degrade model performance. In this paper, we introduce this extendable long-horizon planning challenge and propose a two-phase solution. First, Progressive Trajectory Extension incrementally constructs longer trajectories through multi-round compositional stitching. Second, the Hierarchical Multiscale Diffuser enables efficient training and inference over long horizons by reasoning across temporal scales. To avoid the need for multiple separate models, we propose Adaptive Plan Pondering and the Recursive HM-Diffuser, which unify hierarchical planning within a single model. Experiments show our approach yields strong performance gains, advancing scalable and efficient decision-making over long horizons.
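
To make the stitching idea concrete, the following is a minimal sketch of a single extension step, not the authors' implementation. It assumes trajectories are NumPy arrays of shape (T, state_dim), that "proximity" means Euclidean distance between the end of the bridge and the first state of a candidate, and that the nearest valid candidate is chosen. The names `extend_once`, `bridge_fn`, and `radius` are hypothetical; in the paper the bridge comes from a learned model, which a hand-written function stands in for here.

```python
import numpy as np

def extend_once(source, candidates, bridge_fn, radius):
    """One stitching step: source -> bridge -> target (illustrative only).

    source:     array (T, D), the trajectory to extend
    candidates: list of arrays (T_i, D), possible target trajectories
    bridge_fn:  hypothetical stand-in for a learned bridge generator;
                maps the last source state to an array (B, D)
    radius:     assumed proximity threshold for accepting a target
    """
    # (i) generate a bridge starting from the end of the source trajectory
    bridge = bridge_fn(source[-1])
    bridge_end = bridge[-1]

    # (ii) keep candidates whose first state lies close to the bridge end
    dists = np.array([np.linalg.norm(c[0] - bridge_end) for c in candidates])
    valid = np.flatnonzero(dists <= radius)
    if valid.size == 0:
        return None  # no candidate is close enough to stitch

    # assumption: pick the closest of the valid candidates
    target = candidates[valid[np.argmin(dists[valid])]]

    # (iii) form the extended trajectory
    return np.concatenate([source, bridge, target], axis=0)


# Toy usage in a 2-D state space with a straight-line "bridge".
source = np.array([[0.0, 0.0], [1.0, 0.0]])
candidates = [
    np.array([[2.0, 0.0], [3.0, 0.0]]),      # starts near the bridge end
    np.array([[10.0, 10.0], [11.0, 10.0]]),  # far away, filtered out
]
bridge_fn = lambda s: np.array([s + np.array([0.5, 0.0])])

extended = extend_once(source, candidates, bridge_fn, radius=1.0)
```

Feeding the extended trajectories back in as sources for a further pass would correspond to the multi-round aspect described in the abstract; how candidates are drawn in later rounds is left open in this sketch.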

Paper Structure

This paper contains 28 sections, 4 equations, 21 figures, 10 tables, 5 algorithms.

Figures (21)

  • Figure 1: Progressive Trajectory Extension process. (Left) Conceptual illustration of PTE: (i) A source trajectory (blue), a bridge (green), and target candidates (black) are sampled. (ii) Candidates are filtered by proximity to the bridge, and a valid target (orange) is selected. (iii) The extended trajectory is formed by bridging the source to the chosen target. (Right) Visualization of this process in the Franka Kitchen environment.
  • Figure 2: Recursive Hierarchical Multiscale Diffuser (HM-Diffuser). Given a start state S and goal state G, the Adaptive Plan Pondering (APP) module selects an appropriate starting level $\ell\text{+}1 = f_\phi(S, G)$. The level-conditioned diffuser $p_\theta( \tau | \ell\text{+}1, S, G)$ generates a sequence of high-level subgoals, which are recursively refined by the same model at lower levels. Each subgoal pair defines a start–goal pair for the next level, progressively constructing a complete trajectory from coarse to fine resolution.
  • Figure 4: Adaptive planning capability of HM-Diffuser. (a) Multiscale planning trajectories generated by HM-Diffuser across different levels ($\ell=1,2,3$). The orange box highlights the level selected by the depth predictor, which yields the most efficient plan. (b) Comparison of long-horizon planning results from Diffuser, HD, and HM-Diffuser. While Diffuser and HD exhibit detours or suboptimal paths, HM-Diffuser generates a more direct and optimal trajectory.
  • Figure 5: Trajectory quality. To fairly compare trajectory quality as length increases with stitching, we use normalized return (total return divided by its length). Although this metric tends to penalize longer trajectories, it still improves across PTE rounds, indicating that PTE progressively generates higher-quality data.
  • Figure : (a) Mean Trajectory Length
  • ...and 16 more figures
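
The coarse-to-fine recursion described in the Figure 2 caption can be sketched as below. This is an illustrative sketch under stated assumptions, not the paper's model: `sample_subgoals` is a hypothetical stand-in for the level-conditioned diffuser $p_\theta(\tau | \ell, S, G)$ that simply returns evenly spaced waypoints, level 1 is assumed to be the finest level, and the starting level is passed in by hand where the APP module would choose it.

```python
import numpy as np

def sample_subgoals(level, start, goal, n_segments=2):
    """Hypothetical stand-in for the level-conditioned diffuser.

    Returns n_segments + 1 evenly spaced states from start to goal.
    A learned model would use `level` to set the temporal scale;
    this placeholder ignores it.
    """
    ts = np.linspace(0.0, 1.0, n_segments + 1)
    return [start + t * (goal - start) for t in ts]

def plan(level, start, goal, sampler=sample_subgoals):
    """Recursively refine a plan from `level` down to level 1.

    Each consecutive pair of subgoals at one level becomes the
    start-goal pair for a call at the next lower level.
    """
    waypoints = sampler(level, start, goal)
    if level == 1:  # assumed finest level: return the states directly
        return waypoints
    path = [waypoints[0]]
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        # drop the first state of each refined segment to avoid duplicates
        path.extend(plan(level - 1, a, b, sampler)[1:])
    return path


# Toy usage: start at level 2 in a 2-D state space.
S = np.array([0.0, 0.0])
G = np.array([4.0, 0.0])
trajectory = plan(2, S, G)
xs = [float(p[0]) for p in trajectory]
```

Because the same `sampler` is reused at every level, the sketch mirrors the single-model design the caption describes; with two segments per level, a plan started at level L contains 2^L + 1 states.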