TC-GS: Tri-plane based compression for 3D Gaussian Splatting

Taorui Wang, Zitong Yu, Yong Xu

TL;DR

The paper tackles the high storage requirements of 3D Gaussian Splatting by introducing a Tri-plane based compression framework that captures spatial correlations among unorganized Gaussians. It uses a Tri-plane context model with a KNN-augmented decoder and a Gaussian position prior, combined with anchor masking, Tri-plane compression, and an adaptive wavelet loss to improve both compression and visual fidelity. The method reduces storage by more than 100 times versus vanilla 3DGS and more than 17 times versus Scaffold-GS, while maintaining or improving visual quality as measured by LPIPS and other standard metrics. These results point to a practical path toward scalable, high-fidelity 3D scene representations, and the code is released for reproducibility.

Abstract

Recently, 3D Gaussian Splatting (3DGS) has emerged as a prominent framework for novel view synthesis, providing high fidelity and rapid rendering speed. However, the substantial data volume of 3DGS and its attributes impedes its practical utility, requiring compression techniques to reduce memory cost. The unorganized shape of 3DGS, in turn, makes compression difficult. To formulate the unstructured attributes into a normative distribution, we propose a well-structured Tri-plane to encode Gaussian attributes, leveraging the distribution of attributes for compression. To exploit the correlations among adjacent Gaussians, K-Nearest Neighbors (KNN) is used when decoding the Gaussian distribution from the Tri-plane. We also introduce Gaussian position information as a prior for the position-sensitive decoder. Additionally, we incorporate an adaptive wavelet loss that shifts the focus toward high-frequency details as iterations increase. In extensive experiments across multiple datasets, our approach achieves results comparable to or better than those of state-of-the-art 3D Gaussian Splatting compression methods. The code is released at https://github.com/timwang2001/TC-GS.
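To make the Tri-plane and KNN idea from the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`bilinear`, `query_triplane`, `knn_context`), the dictionary layout of the planes, and the choice to average neighbor features are all assumptions made for illustration. A point with coordinates normalized to [0, 1] is projected onto the XY, XZ and YZ planes, a feature is bilinearly sampled from each, and the features of the K nearest anchors are averaged into a context vector.

```python
import math

def bilinear(plane, u, v):
    """Bilinearly sample a grid of feature vectors (rows x cols x channels) at (u, v) in [0, 1]^2."""
    h, w = len(plane), len(plane[0])
    x, y = u * (w - 1), v * (h - 1)          # u indexes columns, v indexes rows
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    channels = len(plane[0][0])
    return [
        (1 - fx) * (1 - fy) * plane[y0][x0][k]
        + fx * (1 - fy) * plane[y0][x0 + 1][k]
        + (1 - fx) * fy * plane[y0 + 1][x0][k]
        + fx * fy * plane[y0 + 1][x0 + 1][k]
        for k in range(channels)
    ]

def query_triplane(planes, p):
    """Concatenate the features sampled from the XY, XZ and YZ planes at point p."""
    x, y, z = p
    return (bilinear(planes["xy"], x, y)
            + bilinear(planes["xz"], x, z)
            + bilinear(planes["yz"], y, z))

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of points[i] (brute force, excluding i itself)."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]

def knn_context(planes, points, i, k=2):
    """Assumed aggregation: average the Tri-plane features of anchor i and its k neighbors."""
    group = [i] + knn_indices(points, i, k)
    feats = [query_triplane(planes, points[j]) for j in group]
    return [sum(col) / len(feats) for col in zip(*feats)]

# Toy 2x2 planes with one channel each (hypothetical values).
planes = {
    "xy": [[[0.0], [1.0]], [[0.0], [1.0]]],   # feature equals the x coordinate
    "xz": [[[2.0], [2.0]], [[2.0], [2.0]]],   # constant feature
    "yz": [[[0.0], [0.0]], [[1.0], [1.0]]],   # feature equals the z coordinate
}
anchors = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 1.0, 1.0), (0.2, 0.0, 0.0)]
context = knn_context(planes, anchors, 0, k=2)
```

In a learned system the context vector would feed a decoder that predicts the distribution of each anchor's attributes for entropy coding; the brute-force neighbor search here would normally be replaced by a spatial index.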

Paper Structure

This paper contains 13 sections, 7 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Comparison of two baselines and our model. From top to bottom: 3DGS [kerbl20233d], Scaffold-GS [lu2024scaffold], and TC-GS (ours). We conduct a simple experiment on the 'bicycle' scene in the Mip-NeRF 360 dataset [barron2022mip]. The results show that TC-GS both reduces the number of Gaussians and significantly improves storage efficiency. This demonstrates the potential of our method to handle complex scenes with minimal resource requirements, making it a robust solution for scalable applications.
  • Figure 2: Overview of our model. It follows Scaffold-GS [lu2024scaffold], which introduces anchors as a compact representation of neural Gaussians. Top left: Our framework learns a contiguous Tri-plane jointly with the rasterization of neural Gaussians; the Tri-plane is compressed with downsampling to reduce its storage cost. Right: Our context model uses the output of the Tri-plane to predict the distribution of anchor attributes, which are then compressed with entropy coding. Bottom left: To preserve high-frequency content, e.g., the edges of objects, we propose an adaptive wavelet constraint that leads the model to focus on low-frequency features at the beginning and, as learning proceeds, on high-frequency features.
  • Figure 3: Qualitative results on "train" and "truck" from Tanks and Temples [tankstemple] and "drjohnson" from DeepBlending [deepblending]. The PSNR and storage costs are given on the lower left.
  • Figure 4: Qualitative comparisons on the 'train' scene from Tanks and Temples [tankstemple]. The left column presents rendered images from novel views, while the middle column shows the corresponding error maps, computed as the absolute pixel-wise differences between the rendered outputs and the ground truth. In these error maps, darker areas indicate smaller deviations and therefore better rendering quality. Our approach surpasses the baseline method [lu2024scaffold], not only in faithfully reproducing the main object but also in capturing intricate details and substructures, such as the rail and shadows. Note that we brighten the error maps to enhance visibility.
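
The adaptive wavelet constraint described in the Figure 2 caption (low-frequency focus early, high-frequency focus later) can be sketched as follows. This is an illustrative assumption, not the loss defined in the paper: it uses a one-level 2D Haar transform, an L1 error per band, and a hypothetical linear schedule for the high-frequency weight. The names `haar2d`, `band_l1` and `adaptive_wavelet_loss` are invented for this example.

```python
def haar2d(img):
    """One-level 2D Haar transform of an even-sized grayscale image (list of rows).

    Returns the low-frequency band and a tuple of three detail (high-frequency) bands.
    """
    low, d1, d2, d3 = [], [], [], []
    for r in range(0, len(img), 2):
        row_low, row_d1, row_d2, row_d3 = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_low.append((a + b + d + e) / 4)   # local average
            row_d1.append((a - b + d - e) / 4)    # difference between columns
            row_d2.append((a + b - d - e) / 4)    # difference between rows
            row_d3.append((a - b - d + e) / 4)    # diagonal difference
        low.append(row_low)
        d1.append(row_d1)
        d2.append(row_d2)
        d3.append(row_d3)
    return low, (d1, d2, d3)

def band_l1(x, y):
    """Mean absolute difference between two equally sized bands."""
    diffs = [abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry)]
    return sum(diffs) / len(diffs)

def adaptive_wavelet_loss(render, target, step, total_steps):
    """Blend low- and high-frequency errors; the high-frequency weight grows with the step."""
    w_high = min(max(step / total_steps, 0.0), 1.0)   # hypothetical linear schedule
    low_r, highs_r = haar2d(render)
    low_t, highs_t = haar2d(target)
    low_err = band_l1(low_r, low_t)
    high_err = sum(band_l1(hr, ht) for hr, ht in zip(highs_r, highs_t)) / 3
    return (1 - w_high) * low_err + w_high * high_err

# A flat error is purely low-frequency; a checkerboard error is purely high-frequency.
target = [[0.0, 0.0], [0.0, 0.0]]
flat = [[1.0, 1.0], [1.0, 1.0]]
checker = [[1.0, -1.0], [-1.0, 1.0]]
early_flat = adaptive_wavelet_loss(flat, target, step=0, total_steps=100)
late_checker = adaptive_wavelet_loss(checker, target, step=100, total_steps=100)
```

With this schedule a flat brightness error is penalized only early in training and a checkerboard error only late, which mirrors the caption's description of shifting attention from coarse structure to edges. The actual weighting and wavelet family used by TC-GS may differ.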