FAST-Splat: Fast, Ambiguity-Free Semantics Transfer in Gaussian Splatting

Ola Shorinwa; Jiankai Sun; Mac Schwager

TL;DR

FAST-Splat addresses slow training and rendering, high memory usage, and semantic ambiguity in semantic Gaussian Splatting through neural-free, single-phase semantic distillation. It attaches a semantic code to each Gaussian (ellipsoid) and maintains a hash table, uses open-set detectors and CLIP for open-vocabulary grounding, and optimizes geometry, appearance, and semantics jointly with the loss $\mathcal{L}_{\mathrm{sgs}} = \mathcal{L}_{\mathrm{gs}} + \mathcal{L}_{\mathrm{ce}}$. Empirically, it trains $6\times$ to $8\times$ faster, renders $18\times$ to $51\times$ faster, and uses about $6\times$ less GPU memory than the best-competing semantic Gaussian Splatting methods, with competitive or improved semantic segmentation and disambiguation (e.g., resolving prompts like "tea" to the correct object). This enables precise semantic object localization and 3D masks under open-vocabulary prompts, with potential benefits for robotics and scene editing.
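
To make the joint objective concrete, the following is a minimal sketch of how a loss of the form $\mathcal{L}_{\mathrm{sgs}} = \mathcal{L}_{\mathrm{gs}} + \mathcal{L}_{\mathrm{ce}}$ could be evaluated. It reads $\mathcal{L}_{\mathrm{ce}}$ as a per-pixel cross-entropy term, as the subscript suggests, and assumes the per-pixel class logits have already been rendered from the per-Gaussian semantic codes. The function names, the averaging over pixels, and the toy values are illustrative assumptions, not the authors' implementation.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one pixel: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def semantic_gs_loss(l_gs, pixel_logits, pixel_labels):
    """L_sgs = L_gs + L_ce, with L_ce averaged over pixels (an assumption)."""
    per_pixel = [cross_entropy(z, y) for z, y in zip(pixel_logits, pixel_labels)]
    l_ce = sum(per_pixel) / len(per_pixel)
    return l_gs + l_ce

# Toy inputs (hypothetical): two pixels, three classes.
pixel_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pixel_labels = [0, 2]   # e.g. labels proposed by an open-set detector
l_gs = 0.5              # stand-in for the photometric Gaussian Splatting loss
loss = semantic_gs_loss(l_gs, pixel_logits, pixel_labels)
print(loss)
```

Because both terms are summed into one scalar, a single optimization phase can update geometry, appearance, and semantics together, which is the single-phase property the summary describes.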

Abstract

We present FAST-Splat for fast, ambiguity-free semantic Gaussian Splatting, which seeks to address the main limitations of existing semantic Gaussian Splatting methods, namely: slow training and rendering speeds; high memory usage; and ambiguous semantic object localization. We take a bottom-up approach in deriving FAST-Splat, dismantling the limitations of closed-set semantic distillation to enable open-set (open-vocabulary) semantic distillation. Ultimately, this approach enables FAST-Splat to provide precise semantic object localization results, even when prompted with ambiguous user-provided natural-language queries. Further, by exploiting the explicit form of the Gaussian Splatting scene representation to the fullest extent, FAST-Splat retains the remarkable training and rendering speeds of Gaussian Splatting. Specifically, while existing semantic Gaussian Splatting methods distill semantics into a separate neural field or utilize neural models for dimensionality reduction, FAST-Splat directly augments each Gaussian with specific semantic codes, preserving the training, rendering, and memory-usage advantages of Gaussian Splatting over neural field methods. These Gaussian-specific semantic codes, together with a hash table, enable semantic similarity to be measured with open-vocabulary user prompts and further enable FAST-Splat to respond with unambiguous semantic object labels and 3D masks, unlike prior methods. In experiments, we demonstrate that FAST-Splat is 6x to 8x faster to train, renders 18x to 51x faster, and requires about 6x less GPU memory, compared to the best-competing semantic Gaussian Splatting methods. Further, FAST-Splat achieves similar or better semantic segmentation performance compared to existing methods. After the review period, we will provide links to the project website and the codebase.
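
The role of the Gaussian-specific semantic codes and the hash table can be illustrated with a small sketch. It assumes each Gaussian stores an integer code, that a hash table maps each code to a label and a text embedding, and that a prompt is resolved by cosine similarity. The table contents, the three-dimensional embeddings (stand-ins for CLIP features), and the `localize` helper are hypothetical and chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical hash table: semantic code -> (label, text embedding).
label_table = {
    0: ("teapot", [0.9, 0.1, 0.0]),
    1: ("cup",    [0.1, 0.9, 0.0]),
    2: ("table",  [0.0, 0.1, 0.9]),
}

# Hypothetical scene: one semantic code per Gaussian.
gaussian_codes = [0, 0, 1, 2, 2, 1]

def localize(prompt_embedding, table, codes):
    """Return the best-matching label and a per-Gaussian boolean 3D mask."""
    best = max(table, key=lambda c: cosine(prompt_embedding, table[c][1]))
    label = table[best][0]
    mask = [c == best for c in codes]
    return label, mask

# Toy prompt embedding that lies closest to the "teapot" entry.
label, mask = localize([0.8, 0.3, 0.1], label_table, gaussian_codes)
print(label, mask)
```

In this sketch the query returns one explicit label rather than only a similarity heatmap, and the mask is obtained by comparing integer codes, so no neural network is evaluated per Gaussian at query time.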
