Adaptive Planning with Generative Models under Uncertainty

Pascal Jutras-Dubé; Ruqi Zhang; Aniket Bera

TL;DR

The paper addresses the high computational cost of planning with generative models in real-time control by introducing an adaptive policy that leverages long-horizon state predictions and predictive uncertainty. A Deep Ensemble of inverse dynamics action models estimates uncertainty $u_t$, triggering replanning only when necessary via a tunable threshold $\delta$, allowing multiple actions to be executed between plan updates. Experiments on OpenAI Gym locomotion tasks demonstrate substantial speedups (up to ≈50x) and up to $90\%$–$93\%$ reductions in generative-model evaluations with minimal impact on rewards, using the same diffusion-based state predictor and a lightweight action model. The approach is model-agnostic and scalable across generative-model architectures, enabling faster, reliable decision-making in offline RL and potential real-time deployment in robotics and autonomous systems.

Abstract

Planning with generative models has emerged as an effective decision-making paradigm across a wide range of domains, including reinforcement learning and autonomous navigation. Replanning at every timestep might seem intuitive because it bases each decision on the most recent environmental observations, but it is computationally expensive, primarily because of the complexity of the generative model's underlying deep learning architecture. Our work addresses this challenge by introducing a simple adaptive planning policy that leverages the generative model's ability to predict long-horizon state trajectories, enabling several actions to be executed consecutively without immediate replanning. We propose to use the predictive uncertainty derived from a Deep Ensemble of inverse dynamics models to dynamically adjust the intervals between planning sessions. In our experiments on locomotion tasks within the OpenAI Gym framework, we demonstrate that our adaptive planning policy reduces replanning frequency to only about 10% of the steps without compromising performance. Our results underscore the potential of generative modeling as an efficient and effective tool for decision-making.
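To make the uncertainty signal concrete, here is a minimal sketch of how an ensemble of inverse dynamics models could produce a scalar $u_t$ and a replanning decision. It is illustrative only: the function names, the model signature (state, next state) -> action, and the choice of variance across ensemble members averaged over action dimensions are assumptions for this example, not the paper's exact definition of $u_t$.

```python
from statistics import fmean, pvariance
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: an inverse dynamics model maps (state, next_state) -> action.
InverseDynamics = Callable[[Sequence[float], Sequence[float]], Sequence[float]]


def ensemble_action_and_uncertainty(
    models: Sequence[InverseDynamics],
    state: Sequence[float],
    next_state: Sequence[float],
) -> Tuple[List[float], float]:
    """Return the ensemble-mean action and a scalar disagreement score u_t."""
    predictions = [list(model(state, next_state)) for model in models]
    # Regroup so that each tuple holds all members' outputs for one action dimension.
    per_dimension = list(zip(*predictions))
    mean_action = [fmean(values) for values in per_dimension]
    # Assumption: u_t is the population variance across members,
    # averaged over action dimensions.
    u_t = fmean([pvariance(values) for values in per_dimension])
    return mean_action, u_t


def should_replan(u_t: float, delta: float) -> bool:
    """Trigger a new plan only when the uncertainty exceeds the threshold delta."""
    return u_t > delta
```

A larger `delta` tolerates more disagreement between ensemble members and therefore replans less often; a smaller `delta` moves the behavior back toward replanning at every step.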

Paper Structure (16 sections, 11 equations, 3 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Generative Modeling for Decision-Making
Improving Sampling Speed
Estimating Uncertainty in Neural Networks
Background
Problem Description
Generative Modeling for States Prediction
Action Prediction
Adaptive Decision-Making under Uncertainty
Adaptive Policy
Deep Ensembles for Predictive Uncertainty Estimation
Experiments
Experimental Setup
Ablation Study
...and 1 more sections

Figures (3)

Figure 1: The generative model generates a trajectory of states and the action model computes the initial action. The policy continuously predicts and executes subsequent actions as long as the uncertainty remains below a predefined threshold.
Figure 2: Ensemble Action completes 1000 steps in under 25 seconds, while the Decision Diffuser takes over 23 minutes, resulting in a 55x speedup.
Figure 3: Impact of varying $\delta$ on rewards for the Ensemble Action model in the Hopper Medium-Expert dataset, with specific $\delta$ values showing saved NFEs.
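The control loop described in the Figure 1 caption can be sketched roughly as follows. This is a hedged illustration, not the authors' algorithm: `plan`, `act`, and `step` are hypothetical callables standing in for the generative state predictor, the ensemble action model, and the environment. The sketch assumes that `plan` returns a non-empty list of predicted future states beginning with the next state, and that the first action after a fresh plan is always executed.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]


def run_adaptive_policy(
    plan: Callable[[State], List[State]],
    act: Callable[[State, State], Tuple[Action, float]],
    step: Callable[[State, Action], State],
    initial_state: State,
    delta: float,
    num_steps: int,
) -> Tuple[State, int]:
    """Run num_steps actions, replanning only when needed.

    Returns the final state and the number of calls made to the planner.
    """
    state = initial_state
    trajectory: List[State] = []
    index = 0        # position of the next predicted state to track
    plan_calls = 0   # how many times the expensive generative model was used

    for _ in range(num_steps):
        # Replan if the current plan is exhausted (or no plan exists yet).
        needs_plan = index >= len(trajectory)
        if not needs_plan:
            action, u_t = act(state, trajectory[index])
            # Replan if the action model is too uncertain about this step.
            needs_plan = u_t > delta
        if needs_plan:
            trajectory = plan(state)
            plan_calls += 1
            index = 0
            # Assumption: the initial action of a fresh plan is executed as-is.
            action, _ = act(state, trajectory[index])
        state = step(state, action)
        index += 1

    return state, plan_calls
```

Comparing `plan_calls` with `num_steps` gives a rough count of how many planner invocations were skipped, which is the quantity the saved-NFE figures refer to. In a toy setting with a plan horizon of 3 and zero uncertainty, 6 steps need only 2 planner calls, whereas uncertainty that always exceeds `delta` forces a call at every step.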
