Splat-Nav: Safe Real-Time Robot Navigation in Gaussian Splatting Maps

Timothy Chen, Ola Shorinwa, Joseph Bruno, Aiden Swann, Javier Yu, Weijia Zeng, Keiko Nagami, Philip Dames, Mac Schwager

TL;DR

Splat-Nav tackles real-time autonomous navigation in Gaussian Splatting maps by coupling two modules: Splat-Plan, which builds safe polytope corridors and Bézier trajectories within the free space of a GSplat, and Splat-Loc, a monocular pose estimator that uses GSplat-rendered views and PnP to localize the robot without manual frame alignment. The approach provides safety guarantees through ellipsoidal collision geometry, enables open-vocabulary goals via semantic GSplats, and supports online re-planning at more than 2 Hz with pose estimation at about 25 Hz, demonstrated in extensive simulations and on real hardware (126 flights). The key contributions are a fast polytope-corridor generator grounded in ellipsoid-ellipsoid collision tests, a real-time monocular localization framework that leverages GSplat geometry, and experiments showing improved safety over point-cloud-based methods and roughly an order of magnitude faster operation than NeRF-based navigation methods. The work has practical impact for real-time robotic navigation in rich 3D scenes, enabling language-driven goals and resilient operation without external frame alignment or dense depth sensing. Future work will address online GSplat SLAM, dynamic scenes, and memory-efficient representations to broaden applicability.

Abstract

We present Splat-Nav, a real-time robot navigation pipeline for Gaussian Splatting (GSplat) scenes, a powerful new 3D scene representation. Splat-Nav consists of two components: 1) Splat-Plan, a safe planning module, and 2) Splat-Loc, a robust vision-based pose estimation module. Splat-Plan builds a safe-by-construction polytope corridor through the map based on mathematically rigorous collision constraints and then constructs a Bézier curve trajectory through this corridor. Splat-Loc provides real-time recursive state estimates given only an RGB feed from an on-board camera, leveraging the point-cloud representation inherent in GSplat scenes. Working together, these modules give robots the ability to recursively re-plan smooth and safe trajectories to goal locations. Goals can be specified with position coordinates, or with language commands by using a semantic GSplat. We demonstrate improved safety compared to point cloud-based methods in extensive simulation experiments. In a total of 126 hardware flights, we demonstrate equivalent safety and speed compared to motion capture and visual odometry, but without a manual frame alignment required by those methods. We show online re-planning at more than 2 Hz and pose estimation at about 25 Hz, an order of magnitude faster than Neural Radiance Field (NeRF)-based navigation methods, thereby enabling real-time navigation. We provide experiment videos on our project page at https://chengine.github.io/splatnav/. Our codebase and ROS nodes can be found at https://github.com/chengine/splatnav.

Paper Structure

This paper contains 32 sections, 9 theorems, 24 equations, 12 figures, 7 tables.

Key Result

Theorem 1

Given two ellipsoids $\mathcal{E}_a, \mathcal{E}_b$ (with means $\mu_a, \mu_b$ and covariances $\Sigma_a, \Sigma_b$) and the concave function ${K: (0, 1) \to \mathbb{R}}$, $\mathcal{E}_a \cap \mathcal{E}_b = \emptyset$ if and only if there exists $s\in (0, 1)$ such that $K(s) > 1$.
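
The statement above does not spell out $K$. A minimal sketch of the test follows, under two assumptions that are not confirmed by this excerpt: each ellipsoid is the set $\{x : (x-\mu)^\top \Sigma^{-1} (x-\mu) \le 1\}$, and $K(s) = (\mu_b-\mu_a)^\top \left[\tfrac{1}{1-s}\Sigma_a + \tfrac{1}{s}\Sigma_b\right]^{-1} (\mu_b-\mu_a)$, a form that is concave on $(0,1)$ and consistent with the "$K(s) > 1$" condition. The helper names (`solve`, `K`, `ellipsoids_disjoint`) are hypothetical, and the ternary search is a generic stand-in for whatever search the paper uses.

```python
def solve(M, v):
    """Solve M y = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    A = [list(M[i]) + [v[i]] for i in range(n)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(A[r][k] * y[k] for k in range(r + 1, n))
        y[r] = (A[r][n] - tail) / A[r][r]
    return y


def K(s, mu_a, sigma_a, mu_b, sigma_b):
    """Assumed form: d^T [Sigma_a/(1-s) + Sigma_b/s]^{-1} d, with d = mu_b - mu_a."""
    n = len(mu_a)
    d = [mu_b[i] - mu_a[i] for i in range(n)]
    M = [[sigma_a[i][j] / (1.0 - s) + sigma_b[i][j] / s for j in range(n)]
         for i in range(n)]
    y = solve(M, d)
    return sum(d[i] * y[i] for i in range(n))


def ellipsoids_disjoint(mu_a, sigma_a, mu_b, sigma_b, iters=60):
    """Ternary search over s in (0, 1) for the maximum of the concave K.

    Returns True as soon as some s with K(s) > 1 is found (a certificate of
    non-intersection). Returning False means no certificate was found, which
    is the conservative answer for collision checking.
    """
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        k1 = K(m1, mu_a, sigma_a, mu_b, sigma_b)
        k2 = K(m2, mu_a, sigma_a, mu_b, sigma_b)
        if max(k1, k2) > 1.0:
            return True
        if k1 < k2:
            lo = m1
        else:
            hi = m2
    return False


if __name__ == "__main__":
    eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Unit spheres whose centres are 3 apart (radii sum to 2): disjoint.
    print(ellipsoids_disjoint([0, 0, 0], eye, [3, 0, 0], eye))    # True
    # Unit spheres whose centres are 1.5 apart: overlapping.
    print(ellipsoids_disjoint([0, 0, 0], eye, [1.5, 0, 0], eye))  # False
```

For two unit spheres at distance $d$, the assumed $K$ reduces to $d^2 s(1-s)$, whose maximum $d^2/4$ exceeds 1 exactly when $d > 2$, which matches the geometric intuition. Because a single $s$ with $K(s) > 1$ suffices, the search can stop early for well-separated pairs.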

Figures (12)

  • Figure 1: Splat-Nav consists of a safe planning module, Splat-Plan, and a robust localization module, Splat-Loc, both operating on a Gaussian Splatting environment representation. In Splat-Plan we develop a fast, new ellipsoid-ellipsoid collision test to find a safe flight corridor through the GSplat, and plan a spline through the corridor. In Splat-Loc we localize the robot using only RGB images through a PnP algorithm, using the GSplat to render a point cloud. We use a language-embedded GSplat to enable open-vocabulary specification of goal locations like "go to the microwave."
  • Figure 2: Visualization of a point cloud from a NeRF and a mesh from a Gaussian Splat in the synthetic scene Stonehenge. The Chamfer distance between the NeRF and the ground truth is 0.081 (with 4M points); the Chamfer distance between the GSplat and the ground truth is 0.031 (with 3M vertices). The collision geometry of the GSplat (especially in the foreground) is more accurate and can be extracted instantaneously from the model parameters, whereas creating a point cloud from the NeRF requires a costly rendering procedure from many viewpoints.
  • Figure 3: $K(s)$ Bisection Search
  • Figure 4: Splat-Plan, as described by the paper's Splat-Plan algorithm. Given a GSplat and its corresponding ellipsoidal collision geometry, Splat-Plan generates a binary occupancy grid representing the collision-free space. Next, a seed path is created using graph-based search. The paper's supporting-hyperplane corollary is then used to synthesize a set of connected polytopes forming a corridor around the feasible path. Finally, a quadratic program is solved for the control points of a sequence of Bézier curves that lie entirely within the corridor and hence are safe.
  • Figure 5: Splat-Plan
  • ...and 7 more figures
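
The Figure 4 caption says the Bézier curves lie entirely within the corridor. One standard reason this can be guaranteed is the convex-hull property: a Bézier curve is a convex combination of its control points, so if every control point satisfies a polytope's constraints $Ax \le b$, the whole curve does too. The sketch below illustrates only that property; it is not the paper's quadratic program, and the function names and the example box polytope are hypothetical.

```python
from math import comb


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at t in [0, 1] using the Bernstein basis."""
    n = len(ctrl) - 1
    dim = len(ctrl[0])
    weights = [comb(n, i) * (1 - t) ** (n - i) * t ** i for i in range(n + 1)]
    return [sum(w * p[k] for w, p in zip(weights, ctrl)) for k in range(dim)]


def inside_polytope(A, b, x, tol=1e-9):
    """True if A x <= b holds row by row (up to a small tolerance)."""
    return all(
        sum(a_k * x_k for a_k, x_k in zip(row, x)) <= b_i + tol
        for row, b_i in zip(A, b)
    )


def curve_certified_safe(A, b, ctrl):
    """Sufficient check: all control points inside implies the curve is inside."""
    return all(inside_polytope(A, b, p) for p in ctrl)


if __name__ == "__main__":
    # Hypothetical corridor cell: the box [-1, 1] x [-1, 1] written as A x <= b.
    A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
    b = [1, 1, 1, 1]
    ctrl = [(-0.9, -0.9), (-0.5, 0.9), (0.5, -0.9), (0.9, 0.9)]
    print(curve_certified_safe(A, b, ctrl))
    # Spot-check sampled curve points against the same constraints.
    samples = [bezier_point(ctrl, k / 50) for k in range(51)]
    print(all(inside_polytope(A, b, p) for p in samples))
```

The check is sufficient but not necessary: a curve can stay inside a polytope even when a control point lies outside it. Constraining control points is attractive in an optimization setting because the constraints are linear in the decision variables.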

Theorems & Definitions (19)

  • Remark 1
  • Remark 2: Online Gaussian Splatting
  • Remark 3: Handling Uncertainty of the Scene Representation
  • Remark 4: Dynamic Scenes
  • Theorem 1
  • Proposition 1
  • proof
  • Corollary 1
  • Corollary 2
  • proof
  • ...and 9 more