Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection

Hongru Yan, Yu Zheng, Yueqi Duan

TL;DR

This paper tackles RGB-based 3D object detection by replacing discrete point or NeRF representations with a continuous surface model built from 3D Gaussian Splatting. It introduces Gaussian-Det, which encodes objects as a mass of surface-describing Gaussians and refines proposals via a Closure Inferring Module (CIM). The CIM jointly handles two things: the uncertainty of partial surfaces, modeled with a variational residual, and the holistic surface closure, quantified by the absolute flux $|\hat{\boldsymbol{\Phi}}|$. The result is a probabilistic, closure-aware holistic representation that serves as a prior on detection reliability, yielding higher average precision (AP) and recall (AR) on indoor datasets (3D-FRONT and ScanNet) while maintaining real-time performance. The approach is robust to outliers and noisy poses, and it broadens the use of surface-based priors for 3D perception tasks, with potential extensions to open-vocabulary 3D instance segmentation.

Abstract

Skin wrapping our bodies, leather covering a sofa, sheet metal coating a car: these suggest that objects are enclosed by a series of continuous surfaces, which provides an informative geometric prior for objectness deduction. In this paper, we propose Gaussian-Det, which leverages Gaussian Splatting as a surface representation for multi-view 3D object detection. Unlike existing monocular or NeRF-based methods, which depict objects via discrete positional data, Gaussian-Det models objects in a continuous manner by formulating the input Gaussians as feature descriptors on a mass of partial surfaces. Furthermore, to address the numerous outliers inherently introduced by Gaussian Splatting, we devise a Closure Inferring Module (CIM) for comprehensive surface-based objectness deduction. CIM first estimates probabilistic feature residuals for partial surfaces, given the underdetermined nature of Gaussian Splatting; these are then coalesced into a holistic representation of the overall surface closure of the object proposal. In this way, the surface information that Gaussian-Det exploits serves as a prior on the quality and reliability of objectness and as the information basis for proposal refinement. Experiments on both synthetic and real-world datasets demonstrate that Gaussian-Det outperforms various existing approaches in terms of both average precision and recall.

Paper Structure (30 sections, 1 theorem, 20 equations, 15 figures, 11 tables)

This paper contains 30 sections, 1 theorem, 20 equations, 15 figures, 11 tables.

Key Result

Theorem 1

Given a constant vector field $\mathbf{T}$ and a closed surface $\mathbf{S}$, the flux $\mathbf{\Phi}$ of $\mathbf{T}$ through $\mathbf{S}$, i.e., the surface integral of $\mathbf{T}$ over $\mathbf{S}$, is zero:

$$\mathbf{\Phi} = \oint_{\mathbf{S}} \mathbf{T} \cdot \mathrm{d}\mathbf{S} = 0$$
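
The theorem is an instance of the divergence theorem: a constant field has $\nabla \cdot \mathbf{T} = 0$, so its net flux through any closed surface vanishes. The following is a minimal numerical sketch of that fact on a unit cube, not the paper's implementation. The helper `flux`, the face list `CUBE_FACES`, and the particular field `T` are illustrative assumptions introduced here.

```python
# Sketch: discrete check of Theorem 1 on an axis-aligned unit cube.
# Each face is an (outward unit normal, area) pair. All names are illustrative.

def flux(field, faces):
    """Sum of (field . normal) * area over a list of planar faces."""
    total = 0.0
    for normal, area in faces:
        dot = sum(f * n for f, n in zip(field, normal))
        total += dot * area
    return total

# Six faces of a unit cube with outward normals, each of area 1.
CUBE_FACES = [
    ((1.0, 0.0, 0.0), 1.0), ((-1.0, 0.0, 0.0), 1.0),
    ((0.0, 1.0, 0.0), 1.0), ((0.0, -1.0, 0.0), 1.0),
    ((0.0, 0.0, 1.0), 1.0), ((0.0, 0.0, -1.0), 1.0),
]

T = (0.3, -1.2, 2.0)  # arbitrary constant vector field (assumption)

closed_flux = flux(T, CUBE_FACES)     # all six faces: closed surface
open_flux = flux(T, CUBE_FACES[:-1])  # -z face removed: open surface

print("closed |flux|:", abs(closed_flux))
print("open   |flux|:", abs(open_flux))
```

With all six faces the contributions of opposite faces cancel and the flux is zero. Once the bottom face is removed, the flux equals the z-component of `T`, so a nonzero $|\mathbf{\Phi}|$ signals that the surface is not closed.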

Figures (15)

  • Figure 1: We leverage Gaussian Splatting as a surface representation for multi-view 3D object detection. Top Left: Gaussians with color and coordinates. Top Right: Gaussians with color, coordinates, and surface information, but far fewer in number. Bottom (a-b): An object composed of Gaussians exhibits a textured surface, as opposed to the top-left illustration. Bottom (c-d): However, Gaussian Splatting inherently introduces numerous outliers around the object.
  • Figure 2: Illustration of the surface closure prior in Gaussian-Det. For 3D Gaussians within a bounding box, those forming relatively closed surfaces (lower $|\Phi|$) indicate an accurate detection, and vice versa. Note that 3D Gaussians outside the bounding box are not shown for clarity.
  • Figure 3: From the original Gaussians $\mathbb{G}$, we formulate a surface-based Gaussian representation $\mathbb{G}^{surf}$, which is first used to predict the initial object proposal with more open surfaces. This is followed by the Closure Inferring Module (CIM), which contains partial surface feature inference and holistic surface closure coalescence. CIM learns an informative prior on the quality of the predicted objectness, thus controlling the support or suppression of the holistic object representation. The partial and holistic representations are combined to estimate the refined detection result, whose degree of surface closure is similar to that of the ground truth, as measured by the absolute flux value $|\Phi_k|$.
  • Figure 4: Qualitative Results on 3D-FRONT (top two rows) and ScanNet (bottom two rows). We visualize each bounding box with a unique color for clear illustration.
  • Figure 5: Failure case on occluded objects.
  • ...and 10 more figures
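
The surface closure prior described in Figure 2 can be illustrated with a toy example. The sketch below assumes each Gaussian acts as a small oriented surface patch with a position, an outward normal, and an area. This is a simplification made here for illustration, since the paper's actual flux estimator is not given in this summary. The functions `fibonacci_sphere` and `abs_flux_in_box`, the constant field, and the two boxes are all hypothetical.

```python
import math

def fibonacci_sphere(n):
    """Roughly uniform points on the unit sphere (each point is also its outward normal)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden * i
        pts.append((r * math.cos(theta), r * math.sin(theta), z))
    return pts

def abs_flux_in_box(points, normals, areas, field, box_min, box_max):
    """|sum of (field . normal) * area| over the patches whose centers lie inside the box."""
    total = 0.0
    for p, n, a in zip(points, normals, areas):
        inside = all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))
        if inside:
            total += a * sum(f * k for f, k in zip(field, n))
    return abs(total)

N = 2000
points = fibonacci_sphere(N)
normals = points                      # unit sphere: normal equals position
areas = [4.0 * math.pi / N] * N       # equal-area patches (assumption)
field = (0.0, 0.0, 1.0)               # arbitrary constant field (assumption)

# A box enclosing the whole object sees a closed surface.
tight = abs_flux_in_box(points, normals, areas, field,
                        (-1.1, -1.1, -1.1), (1.1, 1.1, 1.1))
# A misplaced box that keeps only the upper half sees an open surface.
shifted = abs_flux_in_box(points, normals, areas, field,
                          (-1.1, -1.1, 0.0), (1.1, 1.1, 1.1))

print("well-placed box |flux|:", tight)
print("misplaced box   |flux|:", shifted)
```

In this toy setting the well-placed box yields a flux close to zero, while the box that cuts the object in half yields a flux close to the area of the exposed cross-section. This matches the caption's reading that a lower $|\Phi|$ indicates a more accurate detection.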

Theorems & Definitions (1)

  • Theorem 1