Sim2Dust: Mastering Dynamic Waypoint Tracking on Granular Media

Andrej Orsula; Matthieu Geist; Miguel Olivares-Mendez; Carol Martinez

TL;DR

This work tackles the sim-to-real gap in rover navigation on granular regolith with a complete framework: policies are trained in massively parallel, procedurally varied simulations (Space Robotics Bench) and transferred zero-shot to a lunar-analogue rover (Leo) in LunaLab. The authors compare RL algorithms and action smoothing filters, showing that DreamerV3 achieves superior zero-shot performance and sample efficiency and that diversity from procedural content generation (PCG) is crucial for generalization. They also show that fine-tuning with high-fidelity particle physics offers only minor gains in low-speed precision at a significant computational cost, and that perceptual gaps in vision-based control remain a significant challenge. The results establish a practical workflow for robust, learning-based autonomous traversal in off-world environments, while identifying key limitations and avenues for future sensor- and perception-focused improvements.

Abstract

Reliable autonomous navigation across the unstructured terrains of distant planetary surfaces is a critical enabler for future space exploration. However, the deployment of learning-based controllers is hindered by the inherent sim-to-real gap, particularly for the complex dynamics of wheel interactions with granular media. This work presents a complete sim-to-real framework for developing and validating robust control policies for dynamic waypoint tracking on such challenging surfaces. We leverage massively parallel simulation to train reinforcement learning agents across a vast distribution of procedurally generated environments with randomized physics. These policies are then transferred zero-shot to a physical wheeled rover operating in a lunar-analogue facility. Our experiments systematically compare multiple reinforcement learning algorithms and action smoothing filters to identify the most effective combinations for real-world deployment. Crucially, we provide strong empirical evidence that agents trained with procedural diversity achieve superior zero-shot performance compared to those trained on static scenarios. We also analyze the trade-offs of fine-tuning with high-fidelity particle physics, which offers minor gains in low-speed precision at a significant computational cost. Together, these contributions establish a validated workflow for creating reliable learning-based navigation systems, marking a substantial step towards deploying autonomous robots in the final frontier.
