World Models for Autonomous Navigation of Terrestrial Robots from LIDAR Observations
Raul Steinmetz, Fabio Demo Rosa, Victor Augusto Kich, Jair Augusto Bottega, Ricardo Bedin Grando, Daniel Fernando Tello Gamarra
TL;DR
The paper tackles the challenge of learning autonomous navigation for terrestrial robots from high-dimensional LIDAR data, where model-free deep reinforcement learning (DRL) methods often struggle with sample inefficiency. It presents a model-based framework built on DreamerV3 that uses an MLP-VAE to encode 360-degree LIDAR readings into latent states, which feed a world model used for imagined rollouts and policy optimization. Empirical results on TurtleBot3 simulations show the DreamerV3-based agent outperforming SAC, DDPG, and TD3, achieving a 100% success rate with the full 360-reading LIDAR scan, while the baselines fail to scale to high-dimensional inputs. The work demonstrates the value of predictive world models and latent representations for robust, efficient navigation and provides open-source code for reproducibility and further research.
Abstract
Autonomous navigation of terrestrial robots using Reinforcement Learning (RL) from LIDAR observations remains challenging due to the high dimensionality of sensor data and the sample inefficiency of model-free approaches. Conventional policy networks struggle to process full-resolution LIDAR inputs, forcing prior works to rely on simplified observations that reduce spatial awareness and navigation robustness. This paper presents a novel model-based RL framework built on top of the DreamerV3 algorithm, integrating a Multi-Layer Perceptron Variational Autoencoder (MLP-VAE) within a world model to encode high-dimensional LIDAR readings into compact latent representations. These latent features, combined with a learned dynamics predictor, enable efficient imagination-based policy optimization. Experiments on simulated TurtleBot3 navigation tasks demonstrate that the proposed architecture achieves faster convergence and a higher success rate than model-free baselines such as SAC, DDPG, and TD3. Notably, the DreamerV3-based agent attains a 100% success rate across all evaluated environments when using the full TurtleBot3 LIDAR scan (360 readings), while model-free methods plateau below 85%. These findings demonstrate that integrating predictive world models with learned latent representations enables more efficient and robust navigation from high-dimensional sensory data.
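To make the encoding step concrete, the following is a minimal sketch of how an MLP encoder could map a 360-reading LIDAR scan to a compact stochastic latent. It is an illustration under stated assumptions, not the authors' implementation: the hidden width, the latent size, the `tanh` activation, the diagonal-Gaussian latent, and every function name are hypothetical, and the weights are random rather than trained. It uses only the Python standard library.

```python
import math
import random

LIDAR_BEAMS = 360  # one range reading per beam, as stated in the abstract
HIDDEN = 64        # hypothetical hidden width (assumption)
LATENT = 16        # hypothetical latent size (assumption)


def init_layer(rng, n_in, n_out):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Affine map: one output value per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def encode(params, scan):
    """Map a scan to the mean and log-variance of a diagonal Gaussian."""
    hidden = [math.tanh(h) for h in dense(params["hidden"], scan)]
    out = dense(params["head"], hidden)
    return out[:LATENT], out[LATENT:]


def sample_latent(rng, mean, log_var):
    """Reparameterization: z = mean + std * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL(N(mean, var) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))


rng = random.Random(0)
params = {
    "hidden": init_layer(rng, LIDAR_BEAMS, HIDDEN),
    "head": init_layer(rng, HIDDEN, 2 * LATENT),
}

# Synthetic scan with ranges normalized to [0, 1] (stand-in for real data).
scan = [rng.random() for _ in range(LIDAR_BEAMS)]
mean, log_var = encode(params, scan)
z = sample_latent(rng, mean, log_var)
kl = kl_to_standard_normal(mean, log_var)
```

In a full world model, a latent such as `z` would be passed to the learned dynamics predictor, and a term such as `kl` would typically appear as a regularizer alongside a reconstruction loss. Both of those parts are omitted here.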
