Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation

Youwei Yu, Junhong Xu, Lantao Liu

TL;DR

This work introduces the Adaptive Diffusion Terrain Generator (ADTG), a novel method that leverages Denoising Diffusion Probabilistic Models to dynamically expand existing training environments by adding more diverse and complex terrains adaptive to the current policy.

Abstract

Model-free reinforcement learning has emerged as a powerful method for developing robust robot control policies capable of navigating through complex and unstructured terrains. The effectiveness of these methods hinges on two essential elements: (1) the use of massively parallel physics simulations to expedite policy training, and (2) an environment generator tasked with crafting sufficiently challenging yet attainable terrains to facilitate continuous policy improvement. Existing methods of environment generation often rely on heuristics constrained by a set of parameters, which limits their diversity and realism. In this work, we introduce the Adaptive Diffusion Terrain Generator (ADTG), a novel method that leverages Denoising Diffusion Probabilistic Models to dynamically expand existing training environments by adding more diverse and complex terrains adapted to the current policy. ADTG guides the diffusion model's generation process through initial noise optimization, blending noise-corrupted terrains from existing training environments, weighted by the policy's performance in each corresponding environment. By manipulating the noise corruption level, ADTG seamlessly transitions between generating similar terrains for policy fine-tuning and novel ones to expand training diversity. Our experiments show that the policy trained with ADTG outperforms policies trained on procedurally generated and natural environments, as well as popular navigation methods.
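The blending step described in the abstract can be illustrated with a minimal sketch. It assumes terrains are 2D elevation arrays, a standard DDPM linear noise schedule, and a simple convex combination of the noise-corrupted terrains. The function names, schedule values, and blending rule are illustrative assumptions, not the paper's implementation, and the trained denoiser that would turn the blended latent back into a terrain is omitted.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Common DDPM default schedule; the values here are assumptions.
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, k, alpha_bar, rng):
    # Closed-form DDPM forward process: sample x_k ~ q(x_k | x_0).
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[k]) * x0 + np.sqrt(1.0 - alpha_bar[k]) * noise

def blend_noised_terrains(terrains, weights, k, alpha_bar, rng):
    # Hypothetical blending rule: normalize the weights to sum to 1,
    # corrupt each terrain to step k, then take the weighted sum.
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    noised = np.stack([forward_diffuse(t, k, alpha_bar, rng) for t in terrains])
    return np.tensordot(weights, noised, axes=1)

rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - linear_beta_schedule(1000))

easy = np.zeros((32, 32))                      # flat placeholder terrain
hard = rng.uniform(-1.0, 1.0, size=(32, 32))   # rough placeholder terrain

# A larger weight on the harder terrain biases the latent toward it;
# a larger k injects more noise, leaving the denoiser more room for novelty.
latent = blend_noised_terrains([easy, hard], [0.3, 0.7], k=200,
                               alpha_bar=alpha_bar, rng=rng)
```

In a full pipeline, `latent` would serve as the starting point of the reverse diffusion process, and the weights would be derived from the policy's performance in each environment. How that mapping is defined is not specified in this summary.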

Paper Structure

This paper contains 13 sections, 3 equations, 5 figures, 1 table, 1 algorithm.

Figures (5)

  • Figure 1: (a) Each row shows denoised terrains at different forward steps $K$, and each column blends terrains of varying difficulty using weighting factor $w$. Decreasing $w_1$ (easier terrain) and increasing $w_2$ raises difficulty. As $K$ increases, terrains show more novelty while maintaining difficulty. (b) Variance of a typical denoising diffusion process with a zoom-in view (left).
  • Figure 2: Framework with our Adaptive Diffusion Terrain Generation (ADTG) and Policy Distillation. Model-free RL trains a privileged policy on ADTG-generated terrains. The privileged policy is then distilled into the deployment (Learner) policy using data aggregation. Iterative training and terrain generation through ADTG enhance the deployment policy's generalization.
  • Figure 3: The comparison of the normalized return among our proposed ADTG, the baselines, and ablation methods.
  • Figure 4: The left panel shows nine challenging environments, the middle panel shows our platform, and the right panel shows Dune-hard, where our method outperformed others in navigating ravines.
  • Figure : ACRL with Adaptive Diffusion Terrain Generator