HaloGS: Loose Coupling of Compact Geometry and Gaussian Splats for 3D Scenes

Changjian Jiang, Kerui Ren, Linning Xu, Jiong Chen, Jiangmiao Pang, Yu Zhang, Bo Dai, Mulin Yu

TL;DR

HaloGS addresses the trade-off between geometric fidelity and photorealistic rendering by decoupling geometry and appearance into a dual representation: low-frequency geometry is captured by learnable triangle primitives, while high-frequency texture and lighting are rendered with neural Gaussians attached to those triangles, each of the form $G(\mathbf{x}) = e^{-\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu})}$ with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$. The method trains in a coarse-to-fine manner: monocular priors $\mathbf{D}_{\text{ref}}$ and $\mathbf{N}_{\text{ref}}$ first guide the triangle geometry, then rendering-based refinement optimizes the Gaussians while depth/normal feedback further refines the triangles. HaloGS extracts level-of-detail (LoD) planar abstractions and stitches them into compact meshes, achieving strong rendering quality across indoor and outdoor scenes while reducing geometry complexity relative to dense approaches. The approach still struggles with semi-transparent materials and distant background regions, which suggests future work on higher-order primitives and dedicated background modeling.
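To make the Gaussian falloff in the TL;DR concrete, here is a minimal sketch that evaluates the unnormalized Gaussian $G(\mathbf{x})$ for a single 3D point, given a mean and a 3×3 covariance. It is only an illustration of the formula, not the HaloGS implementation; the function names `det3`, `solve3`, and `gaussian_falloff` are hypothetical, and the covariance is assumed to be invertible.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a @ y = b for a 3x3 matrix using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("covariance is singular")
    out = []
    for col in range(3):
        # Replace column `col` of `a` with the right-hand side `b`.
        m = [[b[r] if c == col else a[r][c] for c in range(3)]
             for r in range(3)]
        out.append(det3(m) / d)
    return out


def gaussian_falloff(x, mu, sigma):
    """Evaluate exp(-0.5 * (x - mu)^T Sigma^{-1} (x - mu)).

    Hypothetical helper: x and mu are length-3 sequences, sigma is a
    3x3 covariance (assumed symmetric positive definite).
    """
    diff = [xi - mi for xi, mi in zip(x, mu)]
    y = solve3(sigma, diff)  # y = Sigma^{-1} (x - mu)
    mahalanobis_sq = sum(di * yi for di, yi in zip(diff, y))
    return math.exp(-0.5 * mahalanobis_sq)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
stretched = [[4.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

at_mean = gaussian_falloff([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], identity)
one_sigma = gaussian_falloff([2.0, 0.0, 0.0], [0.0, 0.0, 0.0], stretched)
```

The value peaks at 1 when the point coincides with the mean and decays with the squared Mahalanobis distance, so a point two units along an axis with variance 4 (one standard deviation away) gets the same weight as a point one unit away under the identity covariance.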

Abstract

High-fidelity 3D reconstruction and rendering hinge on capturing precise geometry while preserving photorealistic detail. Most existing methods either fuse these goals into a single cumbersome model or adopt hybrid schemes whose uniform primitives lead to a trade-off between efficiency and fidelity. In this paper, we introduce HaloGS, a dual representation that loosely couples coarse triangles for geometry with Gaussian primitives for appearance, motivated by lightweight classic geometry representations and their proven efficiency in real-world applications. Our design yields a compact yet expressive model capable of photorealistic rendering across both indoor and outdoor environments, seamlessly adapting to varying levels of scene complexity. Experiments on multiple benchmark datasets demonstrate that our method yields both compact, accurate geometry and high-fidelity renderings, especially in challenging scenarios where robust geometric structure makes a clear difference.

Paper Structure

This paper contains 25 sections, 12 equations, 14 figures, 12 tables.

Figures (14)

  • Figure 1: HaloGS presents a dual-representation framework that disentangles geometry from appearance for multiview reconstruction. Geometrically, it represents scene structure as a surface-aligned triangle soup and augments it with neural Gaussians for photorealistic rendering. From this soup, we progressively extract planar primitives and assemble them into compact meshes. HaloGS combines high-fidelity appearance with lightweight geometry for efficient storage and downstream processing. The flexible triangle primitive lets it handle both indoor and outdoor environments, adapting seamlessly to varying levels of detail and scene complexity. Here, we illustrate with the large-scale MatrixCity [li2023matrixcity] scene; please visit our project page for additional results: https://city-super.github.io/halogs/.
  • Figure 2: Overview of HaloGS. Our proposed dual-representation is illustrated in (a), where learnable triangles explicitly fit the scene geometry, and neural Gaussians decoded from these triangles render the appearance. In (b), we depict our coarse-to-fine training strategy: during the coarse stage, monocular geometric priors supervise the positions and shapes of the triangles. Subsequently, in the fine stage, neural Gaussians decoded from these half-trained triangles are optimized using ground truth images. Concurrently, depth and normal maps rendered from the neural Gaussians provide additional refinement feedback to further enhance the triangle representation.
  • Figure 3: We evaluate our method against state-of-the-art approaches [kerbl20233d, huang20242d, held20243d, lu2024scaffold, ren2024octree] on challenging Zip-NeRF and VR-NeRF scenes that span both expansive layouts and intricate details. Colored patches draw attention to areas where our approach excels, faithfully reconstructing fine structures and complex planar surfaces, such as wall-mounted mirrors and intricate ceiling ornaments, which existing baselines struggle to capture.
  • Figure 4: Geometric reconstruction comparison. We visualize the learned geometric representations (our Triangle Soup, 2DGS meshes, our extracted planes, and 2DGS planes) on two representative datasets. The top four rows show ScanNet++ indoor scenes with available ground-truth meshes; insets highlight fine structural details. The bottom two rows present results on FAST-LIVO2 outdoor captures with ground-truth point clouds. Our Triangle Soup faithfully preserves geometric fidelity, capturing sharp edges and fine details.
  • Figure 5: Visualization of the compact mesh. We present two representative scenes: Raf_emptyroom (from VR‑NeRF) and Garden (from MipNeRF‑360) in both indoor and outdoor settings.
  • ...and 9 more figures