DreamFlow: Local Navigation Beyond Observation via Conditional Flow Matching in the Latent Space

Jiwon Park, Dongkyu Lee, I Made Aswin Nahrendra, Jaeyoung Lim, Hyun Myung

TL;DR

DreamFlow is a DRL-based local navigation framework that extends the robot's perceptual horizon through conditional flow matching (CFM), enabling the navigation policy to predict unobserved environmental features and proactively avoid potential local minima.

Abstract

Local navigation in cluttered environments often suffers from dense obstacles and frequent local minima. Conventional local planners rely on heuristics and are prone to failure, while deep reinforcement learning (DRL)-based approaches provide adaptability but are constrained by limited onboard sensing. These limitations lead to navigation failures because the robot cannot perceive structures outside its field of view. In this paper, we propose DreamFlow, a DRL-based local navigation framework that extends the robot's perceptual horizon through conditional flow matching (CFM). The proposed CFM-based prediction module learns a probabilistic mapping between the local height map latent representation and a broader spatial representation, conditioned on navigation context. This enables the navigation policy to predict unobserved environmental features and proactively avoid potential local minima. Experimental results demonstrate that DreamFlow outperforms existing methods in terms of latent prediction accuracy and navigation performance in simulation. The proposed method was further validated in cluttered real-world environments with a quadrupedal robot. The project page is available at https://dreamflow-icra.github.io.

Paper Structure (21 sections, 8 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: An illustration of local navigation to the goal point (yellow star) in a cluttered environment with obstacles. (a) Local navigation with only a limited onboard sensing range (blue boxed region) may lead to local minima when obstacles lie beyond the sensor range. (b) DreamFlow enables the robot to find optimal actions within the observed sensing range (blue boxed region) by leveraging CFM-based prediction of extended latent representations (green boxed region). The red and green lines indicate the executed trajectories.
  • Figure 2: The overall framework of DreamFlow. During deployment, the local height map $\mathbf{o}^\mathrm{e}_t$ (blue dots, indicating observable terrain) is encoded into the local latent vector $\mathbf{z}^\mathrm{e}_t$. Subsequently, the pre-trained CFM predicts the extended latent vector $\mathbf{\hat{z}}^\mathrm{e}_t$ conditioned on the context vector $\mathbf{c}_t$ encoded from the proprioceptive observations $\mathbf{o}^\mathrm{p}_t$. The predicted latent vector $\mathbf{\hat{z}}^\mathrm{e}_t$ represents information from the extended height map region (green dots, indicating terrain beyond sensor range). The navigation policy $\mathbf{\pi}_{\text{nav}}$ takes the concatenated $\mathbf{c}_t$ and $\mathbf{\hat{z}}^\mathrm{e}_t$ as input to produce the high-level velocity action $\mathbf{a}_t$. The locomotion policy then generates the low-level joint actions to control the robot. When training the navigation policy, the privileged states $\mathbf{s}^\mathrm{p}_t$ and the extended latent vector $\mathbf{z}^\mathrm{E}_t$ from the extended height map $\mathbf{o}^\mathrm{E}_t$ are used to train the critic network.
  • Figure 3: The CFM training pipeline. The training dataset is collected using pre-trained height map encoders ($\mathcal{H}^\mathrm{e}$ and $\mathcal{H}^\mathrm{E}$) and a proprioceptive encoder (CENet). A local height map $\mathbf{o}^\mathrm{e}_t$ (blue boxed region) is encoded into a local latent representation $\mathbf{z}^\mathrm{e}_t$, and a privileged extended height map $\mathbf{o}^\mathrm{E}_t$ (green boxed region) is encoded into an extended latent representation $\mathbf{z}^\mathrm{E}_t$. During training, CFM learns a velocity field $v_{\theta}(\tau, \mathbf{c}_t, \mathbf{z})$, conditioned on the contextual vector $\mathbf{c}_t$, that transports $\mathbf{z}^\mathrm{e}_t$ at $\tau_0$ towards $\mathbf{z}^\mathrm{E}_t$ at $\tau_1$. Consequently, CFM maps $\mathbf{z}^\mathrm{e}_t$ (blue dots) to the predicted latent vector $\mathbf{\hat{z}}^\mathrm{e}_t$ (red dots), which aligns with $\mathbf{z}^\mathrm{E}_t$ (green dots) in the latent space.
  • Figure 4: (a) The simulation environments for evaluating navigation performance. The Maze environment consists of multiple corners leading to the goal, while the Hallway environment consists of narrow passages. (b) Our quadruped robot, used in real-world experiments, is equipped with a processing unit and a sensor module mounted on its body.
  • Figure 5: Comparison of local navigation performance across three simulation scenarios: (a) Maze (Easy), (b) Maze (Hard), and (c) Hallway. Robot trajectories are depicted from a top view and overlaid on a height map, starting from initial positions (yellow markers) and ending at goal locations (red stars). Trajectories are color-coded according to time progression, with a gradient from dark blue (start) to cyan (goal). DreamFlow demonstrates smoother trajectories, with better obstacle avoidance and more efficient path selection (highlighted by the red box). In contrast, other methods show frequent obstacle contacts and often get stuck in local minima.
  • ...and 1 more figures
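
The Figure 2 and Figure 3 captions describe a velocity field $v_{\theta}(\tau, \mathbf{c}_t, \mathbf{z})$ that transports the local latent $\mathbf{z}^\mathrm{e}_t$ at $\tau_0$ towards the extended latent $\mathbf{z}^\mathrm{E}_t$ at $\tau_1$. The following is a minimal NumPy sketch of that idea, not the authors' implementation. It assumes $\tau_0 = 0$, $\tau_1 = 1$, a straight-line interpolation path between the two latents, a squared-error regression target, and fixed-step Euler integration at deployment; none of these choices is stated in the text above. The names `cfm_training_pair`, `cfm_loss`, `predict_extended_latent`, and `velocity_fn` are hypothetical.

```python
import numpy as np


def cfm_training_pair(z_local, z_ext, tau):
    """Build one regression sample on an ASSUMED straight-line path.

    z_local: (B, D) local latents, the path start at tau = 0.
    z_ext:   (B, D) extended latents, the path end at tau = 1.
    tau:     (B, 1) flow times in [0, 1].
    """
    z_tau = (1.0 - tau) * z_local + tau * z_ext  # point on the path
    target_velocity = z_ext - z_local            # constant along a straight line
    return z_tau, target_velocity


def cfm_loss(velocity_fn, z_local, z_ext, context, rng):
    """Mean squared error between predicted and target velocities.

    velocity_fn(tau, context, z) stands in for v_theta(tau, c_t, z);
    its signature is an assumption made for this sketch.
    """
    tau = rng.uniform(0.0, 1.0, size=(z_local.shape[0], 1))
    z_tau, target = cfm_training_pair(z_local, z_ext, tau)
    pred = velocity_fn(tau, context, z_tau)
    return float(np.mean(np.sum((pred - target) ** 2, axis=-1)))


def predict_extended_latent(velocity_fn, z_local, context, num_steps=10):
    """Integrate dz/dtau = v(tau, c, z) from tau = 0 to 1 with Euler steps."""
    z = np.array(z_local, dtype=float)  # copy so the input is not modified
    dt = 1.0 / num_steps
    for k in range(num_steps):
        tau = np.full((z.shape[0], 1), k * dt)
        z = z + dt * velocity_fn(tau, context, z)
    return z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_local = rng.normal(size=(4, 8))   # stand-in for encoded local height maps
    z_ext = rng.normal(size=(4, 8))     # stand-in for encoded extended height maps
    context = rng.normal(size=(4, 3))   # stand-in for the context vector c_t

    # An oracle field that already knows the straight-line velocity.
    def oracle(tau, c, z):
        return z_ext - z_local

    print("loss:", cfm_loss(oracle, z_local, z_ext, context, rng))
    z_hat = predict_extended_latent(oracle, z_local, context, num_steps=5)
    print("max abs error:", np.max(np.abs(z_hat - z_ext)))
```

In practice `velocity_fn` would be a trained network, and the step count trades prediction accuracy against onboard compute. The oracle above only checks that the loss and the integrator are consistent with each other.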