Simple Hierarchical Planning with Diffusion

Chang Chen, Fei Deng, Kenji Kawaguchi, Caglar Gulcehre, Sungjin Ahn

TL;DR

This paper addresses long-horizon decision-making with diffusion-based planning by introducing Hierarchical Diffuser (HD), a two-level framework that combines hierarchical planning with diffusion models. A high-level diffuser generates "jumpy" subgoals spaced $K$ steps apart over a planning horizon $H$, and a low-level diffuser fills in the dense trajectory between consecutive subgoals, which makes planning more efficient and improves coverage of the data distribution. A variant with dense actions (SD-DA / HD-DA) improves return prediction, and a theoretical analysis gives a generalization bound that illustrates the tradeoff between the jump interval $K$ and the kernel size. Empirically, HD achieves state-of-the-art performance and faster planning on long-horizon offline RL benchmarks (Maze2D, MuJoCo, AntMaze) and shows stronger compositional generalization on out-of-distribution tasks.

Abstract

Diffusion-based generative methods have proven effective in modeling trajectories with offline datasets. However, they often face computational challenges and can falter in generalization, especially in capturing temporal abstractions for long-horizon tasks. To overcome this, we introduce the Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning method combining the advantages of hierarchical and diffusion-based planning. Our model adopts a "jumpy" planning strategy at the higher level, which allows it to have a larger receptive field but at a lower computational cost -- a crucial factor for diffusion-based planning methods, as we have empirically verified. Additionally, the jumpy sub-goals guide our low-level planner, facilitating a fine-tuning stage and further improving our approach's effectiveness. We conducted empirical evaluations on standard offline reinforcement learning benchmarks, demonstrating our method's superior performance and efficiency in terms of training and planning speed compared to the non-hierarchical Diffuser as well as other hierarchical planning methods. Moreover, we explore our model's generalization capability, particularly on how our method improves generalization capabilities on compositional out-of-distribution tasks.
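
The two-stage planning flow described in the abstract can be sketched as follows. This is a minimal illustration of the control flow only, not the authors' implementation: the functions `sample_subgoals` and `sample_segment` are hypothetical placeholders that use linear interpolation where a trained high-level and low-level diffusion model would be sampled, and the names `H` (horizon) and `K` (jump interval) follow the TL;DR.

```python
import numpy as np


def sample_subgoals(start, goal, num_segments):
    """Placeholder for the high-level diffuser (assumption: linear interpolation).

    Returns num_segments + 1 sub-goal states, including start and goal.
    """
    w = np.linspace(0.0, 1.0, num_segments + 1)[:, None]
    return (1.0 - w) * start + w * goal


def sample_segment(g0, g1, K):
    """Placeholder for the low-level diffuser (assumption: linear interpolation).

    Returns K + 1 dense states from sub-goal g0 to sub-goal g1, inclusive.
    """
    w = np.linspace(0.0, 1.0, K + 1)[:, None]
    return (1.0 - w) * g0 + w * g1


def plan(start, goal, H, K):
    """Plan sub-goals every K steps, then refine each gap into a dense segment."""
    if K <= 0 or H % K != 0:
        raise ValueError("this sketch assumes H is a positive multiple of K")
    num_segments = H // K
    # Stage 1: the high-level plan has only H // K + 1 states instead of H + 1.
    subgoals = sample_subgoals(start, goal, num_segments)
    # Stage 2: each pair of consecutive sub-goals is refined independently.
    segments = [
        sample_segment(subgoals[i], subgoals[i + 1], K)
        for i in range(num_segments)
    ]
    # Stitch the segments, dropping the duplicated boundary state of each one.
    dense = np.concatenate([seg[:-1] for seg in segments] + [subgoals[-1:]], axis=0)
    return subgoals, dense


start = np.zeros(2)
goal = np.array([8.0, 4.0])
subgoals, dense = plan(start, goal, H=8, K=4)
```

With `H=8` and `K=4`, the high-level stage works on 3 states while the stitched dense plan has 9, which is the source of the reduced cost at the higher level. Because the segments are refined independently given their sub-goals, the low-level stage could also be run in parallel.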

Paper Structure

This paper contains 28 sections, 2 theorems, 34 equations, 4 figures, 9 tables, 3 algorithms.

Key Result

Theorem 1

For any $\delta>0$, with probability at least $1-\delta$,

Figures (4)

  • Figure 1: Test and train-time differences between Diffuser models. Hierarchical Diffuser (HD) is a general hierarchical diffusion-based planning framework. Unlike Diffuser's training process (A, left), HD's training phase reorganizes each training trajectory into two components: a sub-goal trajectory and dense segments. These components are then used to train the high-level and low-level denoising networks in parallel (B, left). During the testing phase, in contrast to Diffuser (A, right), HD first generates a high-level plan consisting of sub-goals, which is subsequently refined by the low-level planner (B, right).
  • Figure 2: Impact of Kernel Size. Results on the impact of kernel size on the offline RL performance of Diffuser indicate that reasonably enlarging the kernel size can improve performance.
  • Figure 3: Coverage of Data Distribution. Empirically, we observed that Diffuser exhibits insufficient coverage of the dataset distribution. We illustrate this with an example featuring three distinct paths traversing from the start to the goal state. While Diffuser struggles to capture these divergent paths, both our method and Diffuser with an increased receptive field successfully recover this distribution.
  • Figure 4: Large Kernel Size Hurts OOD Generalization. Increasing the kernel size generally improves the offline RL performance of the Diffuser model. However, when a large receptive field and compositional out-of-distribution (OOD) generalization are both required, Diffuser models offer no simple solution. We demonstrate this with sampled plans from both the standard Diffuser and Diffuser with varied kernel sizes (KS). None of them can come up with an optimal plan by stitching training segments together. Conversely, our proposed Hierarchical Diffuser (HD) possesses both a large receptive field and the flexibility needed for compositional OOD tasks.
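
The trajectory reorganization described in the Figure 1 caption can be sketched as below. This is an illustrative sketch under stated assumptions rather than the paper's data pipeline: the function name `split_trajectory` is hypothetical, the trajectory is assumed to hold `H + 1` states with `H` a multiple of `K`, and each dense segment is assumed to include both of its boundary sub-goals.

```python
import numpy as np


def split_trajectory(states, K):
    """Split a (H + 1, state_dim) trajectory into sub-goals and dense segments.

    Assumptions of this sketch: sub-goals are the states at every K-th step,
    and each dense segment spans K + 1 states between consecutive sub-goals.
    """
    H = len(states) - 1
    if K <= 0 or H <= 0 or H % K != 0:
        raise ValueError("this sketch assumes H is a positive multiple of K")
    # Sub-goal trajectory: training data for the high-level denoising network.
    subgoals = states[::K]
    # Dense segments: training data for the low-level denoising network.
    segments = np.stack([states[i:i + K + 1] for i in range(0, H, K)])
    return subgoals, segments


# Toy trajectory with H = 8 steps (9 states) in a 2-dimensional state space.
toy_states = np.arange(18, dtype=float).reshape(9, 2)
toy_subgoals, toy_segments = split_trajectory(toy_states, K=4)
```

Here `toy_subgoals` has shape `(3, 2)` and `toy_segments` has shape `(2, 5, 2)`. The two arrays are derived from the same trajectory but do not depend on each other, which is consistent with the caption's note that the two networks can be trained in parallel.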

Theorems & Definitions (2)

  • Theorem 1
  • Proposition 1