Splat-Nav: Safe Real-Time Robot Navigation in Gaussian Splatting Maps
Timothy Chen, Ola Shorinwa, Joseph Bruno, Aiden Swann, Javier Yu, Weijia Zeng, Keiko Nagami, Philip Dames, Mac Schwager
TL;DR
Splat-Nav addresses real-time autonomous navigation in Gaussian Splatting (GSplat) maps by coupling two modules: Splat-Plan, which builds safe polytope corridors through the free space of the GSplat map and constructs Bézier trajectories inside them, and Splat-Loc, a monocular pose estimator that uses GSplat-rendered views and Perspective-n-Point (PnP) to localize the robot without manual frame alignment. The approach derives its safety guarantees from ellipsoidal collision geometry, accepts open-vocabulary goals via semantic GSplats, and supports online re-planning at more than 2 Hz with pose estimation at about 25 Hz, demonstrated in extensive simulations and in 126 hardware flights. The key contributions are a fast polytope-corridor generator grounded in ellipsoid collision tests, a real-time monocular localization framework that leverages GSplat geometry, and experiments showing improved safety over point-cloud-based methods and roughly an order-of-magnitude speedup over NeRF-based navigation methods. In practice, this enables real-time robot navigation in rich 3D scenes with language-specified goals, using only an on-board RGB camera and no manual frame alignment or dense depth sensing. Future work will address online GSplat SLAM, dynamic scenes, and memory-efficient representations to broaden applicability.
Abstract
We present Splat-Nav, a real-time robot navigation pipeline for Gaussian Splatting (GSplat) scenes, a powerful new 3D scene representation. Splat-Nav consists of two components: 1) Splat-Plan, a safe planning module, and 2) Splat-Loc, a robust vision-based pose estimation module. Splat-Plan builds a safe-by-construction polytope corridor through the map based on mathematically rigorous collision constraints and then constructs a Bézier curve trajectory through this corridor. Splat-Loc provides real-time recursive state estimates given only an RGB feed from an on-board camera, leveraging the point-cloud representation inherent in GSplat scenes. Working together, these modules give robots the ability to recursively re-plan smooth and safe trajectories to goal locations. Goals can be specified with position coordinates, or with language commands by using a semantic GSplat. We demonstrate improved safety compared to point-cloud-based methods in extensive simulation experiments. In a total of 126 hardware flights, we demonstrate equivalent safety and speed compared to motion capture and visual odometry, but without the manual frame alignment required by those methods. We show online re-planning at more than 2 Hz and pose estimation at about 25 Hz, an order of magnitude faster than Neural Radiance Field (NeRF)-based navigation methods, thereby enabling real-time navigation. We provide experiment videos on our project page at https://chengine.github.io/splatnav/. Our codebase and ROS nodes can be found at https://github.com/chengine/splatnav.
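
To illustrate why a Bézier trajectory pairs naturally with a polytope corridor, the sketch below uses the convex-hull property of Bézier curves: if every control point satisfies the polytope's half-space constraints A x <= b, the whole curve does too. This is a minimal, standard-library Python illustration and not the Splat-Plan implementation; the function names, the unit-cube polytope, and the control points are all hypothetical choices made for the example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def in_polytope(A: Sequence[Vector], b: Vector, x: Vector, tol: float = 1e-9) -> bool:
    """Return True if A x <= b holds for every row (x lies in the polytope)."""
    return all(
        sum(a_i * x_i for a_i, x_i in zip(row, x)) <= b_k + tol
        for row, b_k in zip(A, b)
    )


def de_casteljau(ctrl: Sequence[Vector], t: float) -> List[float]:
    """Evaluate a Bezier curve at t in [0, 1] by repeated linear interpolation."""
    pts = [list(p) for p in ctrl]
    while len(pts) > 1:
        pts = [
            [(1.0 - t) * p + t * q for p, q in zip(p0, p1)]
            for p0, p1 in zip(pts, pts[1:])
        ]
    return pts[0]


def control_points_in_polytope(A: Sequence[Vector], b: Vector, ctrl: Sequence[Vector]) -> bool:
    """Sufficient (not necessary) test that the whole curve stays inside the polytope."""
    return all(in_polytope(A, b, p) for p in ctrl)


# Hypothetical corridor cell: the unit cube 0 <= x, y, z <= 1 written as A x <= b.
A = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
b = [1, 0, 1, 0, 1, 0]

# Hypothetical cubic Bezier control points, all inside the cube.
ctrl = [[0.1, 0.1, 0.5], [0.9, 0.1, 0.5], [0.9, 0.9, 0.5], [0.1, 0.9, 0.5]]

# Checking four control points certifies the entire continuous curve.
certified = control_points_in_polytope(A, b, ctrl)

# Dense sampling along the curve agrees with the certificate.
samples_inside = all(in_polytope(A, b, de_casteljau(ctrl, k / 50)) for k in range(51))
```

The check is sufficient but not necessary: a curve can remain inside the polytope even when one of its control points lies outside, so a planner relying on this test is conservative. Its appeal is that a finite set of linear inequalities on the control points covers every point of the continuous trajectory.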
