Large Images are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting

Lingting Zhu, Guying Lin, Jinnan Chen, Xinjie Zhang, Zhenchao Jin, Zhao Wang, Lequan Yu

TL;DR

Implicit Neural Representations (INRs) struggle with training memory and decoding speed on high-resolution images. LIG adapts 2D Gaussian Splatting with a covariance-focused representation and adds a Level-of-Gaussian (LOG) hierarchy, which makes it practical to fit large images with many Gaussian points. The two-stage LOG training yields higher fidelity with reduced memory compared to prior GS-based methods, while maintaining competitive rendering speed. The work extends Gaussian-based image representations to large-scale signals and suggests combining GS with INR-style techniques for further efficiency gains.

Abstract

While Implicit Neural Representations (INRs) have demonstrated significant success in image representation, they are often hindered by large training memory and slow decoding speed. Recently, Gaussian Splatting (GS) has emerged as a promising solution in 3D reconstruction due to its high-quality novel view synthesis and rapid rendering capabilities, positioning it as a valuable tool for a broad spectrum of applications. In particular, a GS-based representation, 2DGS, has shown potential for image fitting. In our work, we present Large Images are Gaussians (LIG), which delves deeper into the application of 2DGS for image representations, addressing the challenge of fitting large images with 2DGS in the situation of numerous Gaussian points, through two distinct modifications: 1) we adopt a variant of representation and optimization strategy, facilitating the fitting of a large number of Gaussian points; 2) we propose a Level-of-Gaussian approach for reconstructing both coarse low-frequency initialization and fine high-frequency details. Consequently, we successfully represent large images as Gaussian points and achieve high-quality large image representation, demonstrating its efficacy across various types of large images. Code is available at https://github.com/HKU-MedAI/LIG.

Paper Structure

This paper contains 13 sections, 6 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Comparison of LIG and GaussianImage on large image fitting quality. GaussianImage performs poorly when optimizing a large number of Gaussian points on high-resolution images, whereas LIG consistently improves in quality as the number of Gaussian points increases. The same behavior is observed across multiple datasets.
  • Figure 2: LIG is capable of representing large images with high quality. Examples include a histopathology image and a satellite image, shown as multi-resolution patches with the PSNR value displayed at the bottom-right corner of each image.
  • Figure 3: Illustration of our proposed Level-of-Gaussian approach, which fits large images with two levels of Gaussian points. In the first stage, we allocate a portion of the Gaussian points to form the $L_0$ Gaussians, which learn the low-frequency initialization from the down-sampled image. In the second stage, the $L_1$ Gaussians learn the high-frequency details from the difference between the up-sampled estimation and the target. We present the absolute values of the difference and enhance the image for visualization.
  • Figure 4: Qualitative comparison between LIG and GaussianImage on STimage and FGF2 samples. We show small patches from the rendered images and the GT images. The difference images are shifted by 0.5 for visualization.
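
The two-stage scheme described in the Figure 3 caption can be sketched as a coarse-plus-residual decomposition. The following is a minimal illustration in NumPy, not the authors' implementation: the actual Gaussian fitting is replaced by a hypothetical stand-in (`fit_placeholder`, a simple quantizer that mimics a lossy fit), and the function names, the average-pool down-sampling, the nearest-neighbour up-sampling, and the factor of 4 are all assumptions made for illustration.

```python
import numpy as np


def downsample(img, factor):
    """Average-pool a (H, W) image; H and W must be divisible by factor."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def upsample(img, factor):
    """Nearest-neighbour up-sampling by an integer factor."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def fit_placeholder(target, levels=32):
    """Hypothetical stand-in for fitting Gaussians to `target`.

    It only quantizes the target to `levels` values so that the fit is lossy;
    a real pipeline would optimize 2D Gaussian parameters here instead.
    """
    lo, hi = target.min(), target.max()
    if hi == lo:
        return target.copy()
    q = np.round((target - lo) / (hi - lo) * (levels - 1)) / (levels - 1)
    return q * (hi - lo) + lo


def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((a - b) ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(peak ** 2 / mse))


def two_level_fit(image, factor=4):
    """Return (coarse estimate, final estimate) of a (H, W) image."""
    # Stage 1: the first level fits the down-sampled (low-frequency) image.
    l0 = fit_placeholder(downsample(image, factor))
    l0_up = upsample(l0, factor)
    # Stage 2: the second level fits the difference between the target
    # and the up-sampled first-level estimate (high-frequency details).
    l1 = fit_placeholder(image - l0_up)
    return l0_up, l0_up + l1
```

Calling `two_level_fit` on a grayscale array with values in [0, 1] returns the up-sampled first-level estimate and the final two-level estimate; comparing `psnr(image, coarse)` with `psnr(image, final)` shows how much the residual level contributes under these assumptions.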