Compact 3D Gaussian Representation for Radiance Field

Joo Chan Lee, Daniel Rho, Xiangyu Sun, Jong Hwan Ko, Eunbyung Park

TL;DR

The paper tackles the memory and storage cost of 3D Gaussian Splatting (3DGS) with three components: a learnable volume mask that prunes Gaussians, a grid-based neural field that represents view-dependent color in place of spherical harmonics, and residual vector-quantized codebooks for the geometric attributes. Combined with quantization and entropy coding, these components cut storage by over 25$\times$ relative to 3DGS and increase rendering speed while preserving reconstruction quality and real-time rendering. The approach yields clear gains over 3DGS, especially on challenging real-world data such as Deep Blending, and is supported by extensive ablations and implementation details. The result is a compact, high-fidelity scene representation suited to interactive applications and scalable deployment.

Abstract

Neural Radiance Fields (NeRFs) have demonstrated remarkable potential in capturing complex 3D scenes with high fidelity. However, one persistent challenge that hinders the widespread adoption of NeRFs is the computational bottleneck due to the volumetric rendering. On the other hand, 3D Gaussian splatting (3DGS) has recently emerged as an alternative representation that leverages a 3D Gaussian-based representation and adopts the rasterization pipeline to render the images rather than volumetric rendering, achieving very fast rendering speed and promising image quality. However, a significant drawback arises as 3DGS entails a substantial number of 3D Gaussians to maintain the high fidelity of the rendered images, which requires a large amount of memory and storage. To address this critical issue, we place a specific emphasis on two key objectives: reducing the number of Gaussian points without sacrificing performance and compressing the Gaussian attributes, such as view-dependent color and covariance. To this end, we propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field rather than relying on spherical harmonics. Finally, we learn codebooks to compactly represent the geometric attributes of Gaussian by vector quantization. With model compression techniques such as quantization and entropy coding, we consistently show over 25$\times$ reduced storage and enhanced rendering speed, while maintaining the quality of the scene representation, compared to 3DGS. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering. Our project page is available at https://maincold2.github.io/c3dgs/.

Paper Structure (22 sections, 7 equations, 8 figures, 11 tables)

This paper contains 22 sections, 7 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: Our method achieves reduced storage and faster rendering speed while maintaining the high-quality reconstruction of 3DGS. The core idea is to effectively remove the redundant Gaussians that do not significantly contribute to the overall performance (the sparser distribution of Gaussian points and reduced ellipsoid redundancy shown in the figure). We also introduce a more compact representation of Gaussian attributes, resulting in markedly improved storage efficiency and rendering speed.
  • Figure 2: The detailed architecture of our proposed compact 3D Gaussian.
  • Figure 3: Visualization of the varying count of Gaussians during training (Bonsai scene). '# Gaussians' denotes the number of Gaussians.
  • Figure 4: The detailed process of R-VQ to represent the scale and rotation of Gaussians. In the first stage, the scale and rotation vectors are compared to codes in each codebook, with the closest code identified as the result. In the next stage, the residual between the original vector and the first stage's result is compared with another codebook. This process is repeated up to the final stage. As a result, the selected indices and the codebook from each stage collectively represent the original vector.
  • Figure 5: Qualitative results of our approach compared to 3DGS. The rendering PSNR and storage size are shown on each result.
  • ...and 3 more figures
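
The staged procedure described in the Figure 4 caption can be illustrated with a minimal sketch of residual vector quantization (R-VQ). This is not the authors' implementation: the function names, the squared-L2 distance, and the tiny hand-written codebooks are assumptions made for illustration, whereas the paper learns its codebooks during training. Each stage picks the closest code to the current residual, subtracts it, and passes the remainder on to the next stage.

```python
def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2, an assumption)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes the previous stage's residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        indices.append(idx)
        # Whatever this stage could not represent is handed to the next stage.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected code from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical toy setup: 2 stages, 4 codes per stage, 3-D vectors
# (standing in for a Gaussian's scale vector). Real codebooks would be learned.
codebooks = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 0.0, 0.0], [0.25, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.25]],
]
scale = [0.9, 0.3, 0.0]

indices = rvq_encode(scale, codebooks)   # one small integer per stage
approx = rvq_decode(indices, codebooks)  # sum of the selected codes
print(indices, approx)
```

In this sketch only the per-stage indices need to be stored for each Gaussian, while the codebooks are shared across all Gaussians, which is where the storage saving would come from.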