Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree

Lang Feng, Pengjie Gu, Bo An, Gang Pan

TL;DR

This work tackles the stochastic risk of diffusion-based planners producing infeasible trajectories by introducing the Trajectory Aggregation Tree (TAT), a training-free mechanism that aggregates past and current trajectories into a dynamic tree and makes decisions from high-weight, aggregated nodes. The authors derive an artifact-probability bound that decreases as the number of trajectories increases, providing a theoretical guarantee of reliability and resilience to artifacts. Empirically, TAT is deployed as a plug-and-play enhancement on existing diffusion planners, yielding consistent performance gains and more than 3x faster planning across Maze2D, block stacking, and MuJoCo locomotion tasks. The results suggest that diffusion-based planning can be made safer and faster in real-time settings without additional training, with potential extensions to richer weighting schemes and spatial considerations.

Abstract

Diffusion planners have shown promise in long-horizon and sparse-reward tasks owing to their non-autoregressive plan generation. However, their inherent stochastic risk of generating infeasible trajectories poses significant challenges to their reliability and stability. We introduce the Trajectory Aggregation Tree (TAT) to address this issue. Whereas prior methods rely solely on raw trajectory predictions, TAT aggregates information from both historical and current trajectories into a dynamic tree-like structure in which each trajectory is a branch and each state is a node. As the structure evolves with the integration of new trajectories, unreliable states are marginalized and the most impactful nodes are prioritized for decision-making. TAT can be deployed without modifying the original training and sampling pipelines of diffusion planners, making it a training-free, ready-to-deploy solution. We provide both theoretical analysis and empirical evidence to support TAT's effectiveness. Our results highlight its ability to resist the risk from unreliable trajectories, improve the performance of diffusion planners in $100\%$ of tasks, and tolerate lower sample quality by an appreciable margin, thereby enabling planning with a more than $3\times$ acceleration.

Paper Structure

This paper contains 33 sections, 2 theorems, 29 equations, 7 figures, 7 tables, 3 algorithms.

Key Result

Proposition 4.3

In TAT's planning, each action is a product of aggregated information from $n$ trajectories. For the case $\lambda=1$, the probability of TAT choosing the artifact, $P^{\text{artifact}}(n; \varUpsilon)$, is bounded above by a quantity $P^{\text{upper}}$ that depends on $n$ and $\varepsilon$ and is expressed through the error function ${\rm erf}(\cdot)$.

Figures (7)

  • Figure 1: Overview of TAT. (a) Diffusion pipeline. Given the environment state, TAT uses the original diffusion planner to sample trajectories. (b) TAT pipeline. The tree incorporates past and current trajectories to build a comprehensive picture of future states. Darker red indicates a higher node weight. The tree grows dynamically and stays in sync with the environment by pruning branches (the light, transparent gray part).
  • Figure 2: Planning with TAT. Darker red indicates a higher node weight. (a) Weight allocation to each state in the sampled trajectory $\bm{\tau}$. (b) Merging: starting from the root node, the states in $\bm{\tau}$ are traversed sequentially and their weights are assigned to the tree nodes. (c) Expanding the tree with the sub-trajectory $\bm{\tau}^{\rm latter}$ that the tree has not yet visited. (d) Acting by selecting the child node with the highest weight for decision-making. (e) Pruning the tree to synchronize it with the environment.
  • Figure 3: The dynamic trend of the upper bound $P^{\text{upper}}$ of $P^{\text{artifact}}(n; \varUpsilon)$ with respect to $\varepsilon$ and $n$. As $n$ increases, it monotonically decreases and becomes less sensitive to the variation of $\varepsilon$.
  • Figure 4: The planning process of $\text{Diffuser}^{\varUpsilon}$ in open-loop (a) and closed-loop (b) manners. Each panel marks the starting position and the goal position. The red lines in the top row are the plans of the baseline Diffuser, which generates artifacts at the beginning (a) or introduces new artifacts in subsequent steps (b). $\text{Diffuser}^{\varUpsilon}$ dynamically filters out and prunes these inefficient branches, indicated by the light transparent segments. The blue lines in the bottom row are the results of $\text{Diffuser}^{\varUpsilon}$.
  • Figure 7.1: Visual comparison of Diffuser and $\text{Diffuser}^{\varUpsilon}$ in the presence of artifacts in the Maze2D-Large environments. Each panel marks the starting position and the goal position.
  • ...and 2 more figures
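
The five steps in the Figure 2 caption (weight, merge, expand, act, prune) can be illustrated with a small Python sketch. This is not the authors' implementation. The class and parameter names (`Node`, `TrajectoryTree`, `lam`, `dist_tol`) are hypothetical, the geometric weight `lam ** i` is an assumed weighting scheme (with `lam = 1` every state counts equally), and matching a state to an existing node by Euclidean distance within `dist_tol` is an assumed merge criterion. Merged nodes simply keep the first state they saw.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Node:
    state: tuple                      # representative state of this node
    weight: float = 0.0               # accumulated weight from merged trajectories
    children: list = field(default_factory=list)


class TrajectoryTree:
    """Toy trajectory aggregation tree (illustrative assumptions only)."""

    def __init__(self, root_state, lam=1.0, dist_tol=0.1):
        self.root = Node(tuple(root_state))
        self.lam = lam                # assumed per-step weight decay
        self.dist_tol = dist_tol      # assumed distance threshold for merging

    def merge(self, trajectory):
        """Merge a sampled trajectory of future states into the tree.

        Where an existing child lies within dist_tol, its weight is increased
        (merging); otherwise a new child is created (expanding).
        """
        node = self.root
        for i, state in enumerate(trajectory):
            state = tuple(state)
            close = [c for c in node.children
                     if dist(c.state, state) <= self.dist_tol]
            if close:
                child = min(close, key=lambda c: dist(c.state, state))
            else:
                child = Node(state)
                node.children.append(child)
            child.weight += self.lam ** i
            node = child

    def act(self):
        """Select the child of the root with the highest weight."""
        return max(self.root.children, key=lambda c: c.weight)

    def prune(self, chosen):
        """Make the chosen child the new root, discarding its siblings."""
        self.root = chosen


# Two similar trajectories and one outlier standing in for an artifact.
tree = TrajectoryTree(root_state=(0.0, 0.0), lam=1.0, dist_tol=0.1)
tree.merge([(1.0, 0.0), (2.0, 0.0)])
tree.merge([(1.02, 0.0), (2.01, 0.0)])
tree.merge([(0.0, 1.0), (0.0, 2.0)])
best = tree.act()
tree.prune(best)
```

In this toy run the two agreeing trajectories share nodes and accumulate weight, so the outlier branch is outvoted at the root and dropped by pruning. A real planner would draw the trajectories from the diffusion model at every environment step and merge them into the pruned tree before acting again.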

Theorems & Definitions (6)

  • Definition 4.1: Artifact Probability
  • Definition 4.2: Number of Trajectories
  • Proposition 4.3
  • Corollary 4.4
  • proof
  • proof