An Open-Source LiDAR and Monocular Off-Road Autonomous Navigation Stack

Rémi Marsal, Quentin Picard, Adrien Poiré, Sébastien Kerbourc'h, Thibault Toralba, Clément Yver, Alexandre Chapoutot, David Filliat

Abstract

Off-road autonomous navigation demands reliable 3D perception for robust obstacle detection in challenging unstructured terrain. While LiDAR is accurate, it is costly and power-intensive. Monocular depth estimation using foundation models offers a lightweight alternative, but its integration into outdoor navigation stacks remains underexplored. We present an open-source off-road navigation stack supporting both LiDAR and monocular 3D perception without task-specific training. For the monocular setup, we combine zero-shot depth prediction (Depth Anything V2) with metric depth rescaling using sparse SLAM measurements (VINS-Mono). Two key enhancements improve robustness: edge-masking to reduce obstacle hallucination and temporal smoothing to mitigate the impact of SLAM instability. The resulting point cloud is used to generate a robot-centric 2.5D elevation map for costmap-based planning. Evaluated in photorealistic simulations (Isaac Sim) and real-world unstructured environments, the monocular configuration matches high-resolution LiDAR performance in most scenarios, demonstrating that foundation-model-based monocular depth estimation is a viable LiDAR alternative for robust off-road navigation. By open-sourcing the navigation stack and the simulation environment, we provide a complete pipeline for off-road navigation as well as a reproducible benchmark. Code available at https://github.com/LARIAD/Offroad-Nav.
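To make the monocular pipeline described above concrete, the sketch below illustrates one way the metric rescaling and edge-masking steps could be implemented: a relative depth prediction (e.g. from Depth Anything V2) is fitted to sparse metric depths from SLAM features (e.g. VINS-Mono) with a least-squares scale and shift, and pixels near strong depth discontinuities are masked out. This is a minimal illustration under the assumption of an affine relation between predicted and metric depth; all function and variable names are hypothetical and do not reflect the released code.

```python
# Minimal sketch (not the authors' implementation): rescale a relative monocular
# depth map to metric scale using sparse SLAM depths, then mask depth edges.
import numpy as np

def rescale_depth(rel_depth, sparse_uv, sparse_z):
    """Fit metric_depth ~= a * rel_depth + b at pixels with SLAM measurements.

    rel_depth : (H, W) relative depth prediction
    sparse_uv : (N, 2) pixel coordinates (u, v) of SLAM feature points
    sparse_z  : (N,)   metric depths of those points
    """
    d = rel_depth[sparse_uv[:, 1], sparse_uv[:, 0]]        # predictions at SLAM pixels
    A = np.stack([d, np.ones_like(d)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, sparse_z, rcond=None)  # least-squares scale and shift
    return a * rel_depth + b

def edge_mask(depth, grad_thresh=0.5):
    """Keep only pixels away from strong depth discontinuities,
    limiting hallucinated obstacles at object boundaries."""
    gy, gx = np.gradient(depth)
    return np.hypot(gx, gy) < grad_thresh

# Hypothetical usage:
# metric = rescale_depth(pred, uv, z)
# valid  = edge_mask(metric)   # back-project only valid pixels into the point cloud
```

In practice the temporal smoothing mentioned in the abstract could then be applied to the fitted scale and shift across frames to damp SLAM instabilities, before back-projecting the rescaled depth into the 3D point cloud used for elevation mapping.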

Figures (5)

  • Figure 3: Autonomous navigation with the wheeled ground robot in the real and simulation off-road environment.
  • Figure A1: Diagram of our navigation pipeline. It takes as input either a LiDAR point cloud or monocular camera images with a SLAM sparse point cloud, as well as the robot localization. It outputs the velocity commands executed by the robot.
  • Figure C1: Estimated rescaled depth maps and the extracted $3$D point clouds from the top RGB image. The left side shows results without edge filtering, while the right side shows results with edge filtering applied.
  • Figure D1: Top views of the simulation experiments. From left to right, the easy, medium and hard simulated environments, respectively. In each environment, the goal points of the $10$, $20$ and $30$ meter scenarios are indicated with an orange, blue and green star, respectively. The reference remotely operated trajectories are given in dashed lines and the ones performed by the robot are shown in solid lines. From top to bottom: the robot trajectories are obtained using the LiDAR (sim-tuned), the LiDAR (real-params) and Mono-VINS (STCD only), respectively.
  • Figure D2: Top view of the real-world experimental area where obstacles are placed on an off-road terrain. The first row shows the LiDAR experiments and the second the runs with the monocular camera. From left to right: the easy, medium and hard scenarios. Obstacles are represented as white circles and rectangles. The dashed lines correspond to the remotely operated reference trajectories.