Resisting Stochastic Risks in Diffusion Planners with the Trajectory Aggregation Tree
Lang Feng, Pengjie Gu, Bo An, Gang Pan
TL;DR
This work tackles the stochastic risk of diffusion-based planners producing infeasible trajectories by introducing the Trajectory Aggregation Tree (TAT), a training-free mechanism that aggregates past and current trajectories into a dynamic tree and makes decisions from high-weight, aggregated nodes. The authors derive an artifact-probability bound that decreases as the number of trajectories increases, providing a theoretical guarantee of reliability and resilience to artifacts. Empirically, TAT is deployed as a plug-and-play enhancement on existing diffusion planners, yielding consistent performance gains and more than 3x faster planning across Maze2D, block stacking, and MuJoCo locomotion tasks. The results suggest that diffusion-based planning can be made safer and faster in real-time settings without additional training, with potential extensions to richer weighting schemes and spatial considerations.
Abstract
Diffusion planners have shown promise in long-horizon and sparse-reward tasks owing to their non-autoregressive plan generation. However, their inherent stochastic risk of generating infeasible trajectories poses significant challenges to their reliability and stability. We introduce the Trajectory Aggregation Tree (TAT) to address this issue. Whereas prior methods rely solely on raw trajectory predictions, TAT aggregates information from both historical and current trajectories into a dynamic tree-like structure, in which each trajectory is treated as a branch and each individual state as a node. As the structure evolves with the integration of new trajectories, unreliable states are marginalized and the most impactful nodes are prioritized for decision-making. TAT can be deployed without modifying the original training and sampling pipelines of diffusion planners, making it a training-free, ready-to-deploy solution. We provide both theoretical analysis and empirical evidence to support TAT's effectiveness. Our results show that TAT resists the risk from unreliable trajectories, improves the performance of diffusion planners on $100\%$ of tasks, and exhibits an appreciable tolerance margin for sample quality, thereby enabling planning with more than $3\times$ acceleration.
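To make the branch-and-node picture concrete, the following is a minimal sketch of how sampled trajectories might be aggregated into a weighted tree. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `merge_trajectory`, `select_next`, the Euclidean `merge_radius` criterion for deciding that two states coincide, and the per-step `decay` weighting are all hypothetical choices made here for clarity.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass(eq=False)
class Node:
    """One tree node: a representative state plus its accumulated weight."""
    state: np.ndarray
    weight: float = 0.0
    children: list = field(default_factory=list)


def merge_trajectory(root, trajectory, merge_radius=0.1, decay=0.9):
    """Merge one planned trajectory (states following the root) into the tree.

    Assumption: two states are treated as the same node when they lie within
    `merge_radius` (Euclidean distance), and a state at planning step k
    contributes weight decay**k. Both rules are illustrative only.
    """
    node = root
    for step, state in enumerate(trajectory):
        state = np.asarray(state, dtype=float)
        w = decay ** step
        # Look for an existing child close enough to absorb this state.
        match = None
        for child in node.children:
            if np.linalg.norm(child.state - state) <= merge_radius:
                match = child
                break
        if match is None:
            # No nearby child: start a new branch.
            match = Node(state=state.copy())
            node.children.append(match)
        else:
            # Update the representative state as a weighted mean.
            total = match.weight + w
            match.state = (match.weight * match.state + w * state) / total
        match.weight += w
        node = match


def select_next(root):
    """Return the child of the root with the largest aggregated weight."""
    return max(root.children, key=lambda c: c.weight)
```

In this sketch, a trajectory that wanders off on its own forms a low-weight branch, while trajectories that agree reinforce a shared branch, so `select_next` follows the consensus. Reusing the selected child as the next root would carry previously aggregated trajectories forward alongside newly sampled ones.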
