Image-GS: Content-Adaptive Image Representation via 2D Gaussians

Yunxiang Zhang, Bingxuan Li, Alexandr Kuznetsov, Akshay Jindal, Stavros Diolatzis, Kenneth Chen, Anton Sochenov, Anton Kaplanyan, Qi Sun

TL;DR

Image-GS introduces an explicit, content-adaptive image representation based on anisotropic 2D Gaussians and a custom differentiable renderer. It spawns Gaussians adaptively, guided by image gradients, and refines the set progressively by adding Gaussians where reconstruction error persists. This yields favorable rate-distortion trade-offs and hardware-friendly decoding at about 0.3K MACs per pixel. The method supports a smooth level-of-detail hierarchy and enables applications in semantics-aware compression and joint image compression and restoration. On an evaluation set of 2K×2K images and on texture stacks, Image-GS outperforms neural baselines at ultra-low bitrates and remains competitive with standard texture codecs, which makes it relevant to real-time graphics and machine vision workflows.
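The gradient-guided spawning and error-driven additions described above both amount to drawing positions from a per-pixel weight map. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the forward-difference gradient, and the `uniform_mix` blend (which keeps some samples in flat regions) are all assumptions made for illustration.

```python
import random

def gradient_magnitude(img):
    """Forward-difference gradient magnitude of a 2D list of floats."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

def abs_error(target, recon):
    """Per-pixel absolute reconstruction error."""
    return [[abs(t - r) for t, r in zip(trow, rrow)]
            for trow, rrow in zip(target, recon)]

def sample_positions(weight_map, count, uniform_mix=0.3, rng=random):
    """Draw `count` (x, y) pixel positions from a weight map.

    The sampling probability blends the normalized weight map with a
    uniform distribution. `uniform_mix` is a hypothetical value chosen
    for this sketch, not one taken from the paper.
    """
    h, w = len(weight_map), len(weight_map[0])
    flat = [v for row in weight_map for v in row]
    total = sum(flat)
    n = h * w
    if total > 0:
        probs = [(1 - uniform_mix) * v / total + uniform_mix / n for v in flat]
    else:
        probs = [1.0 / n] * n  # flat image: fall back to uniform sampling
    picks = rng.choices(range(n), weights=probs, k=count)
    return [(i % w, i // w) for i in picks]

# Toy target: left half flat, right half noisy.
rng = random.Random(0)
img = [[0.0] * 16 + [rng.random() for _ in range(16)] for _ in range(32)]

# Initialization: positions follow the gradient magnitude.
positions = sample_positions(gradient_magnitude(img), 2000, rng=rng)
right = sum(1 for x, _ in positions if x >= 16)

# Progressive step: positions follow the error of a (here, all-zero) reconstruction.
recon = [[0.0] * 32 for _ in range(32)]
extra = sample_positions(abs_error(img, recon), 200, rng=rng)
```

In this toy setup most of the sampled positions fall in the noisy right half, which mirrors the stated goal of allocating more Gaussians to high-frequency or poorly reconstructed areas.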

Abstract

Neural image representations have emerged as a promising approach for encoding and rendering visual data. Combined with learning-based workflows, they demonstrate impressive trade-offs between visual fidelity and memory footprint. Existing methods in this domain, however, often rely on fixed data structures that suboptimally allocate memory or compute-intensive implicit models, hindering their practicality for real-time graphics applications. Inspired by recent advancements in radiance field rendering, we introduce Image-GS, a content-adaptive image representation based on 2D Gaussians. Leveraging a custom differentiable renderer, Image-GS reconstructs images by adaptively allocating and progressively optimizing a group of anisotropic, colored 2D Gaussians. It achieves a favorable balance between visual fidelity and memory efficiency across a variety of stylized images frequently seen in graphics workflows, especially for those showing non-uniformly distributed features and in low-bitrate regimes. Moreover, it supports hardware-friendly rapid random access for real-time usage, requiring only 0.3K MACs to decode a pixel. Through error-guided progressive optimization, Image-GS naturally constructs a smooth level-of-detail hierarchy. We demonstrate its versatility with several applications, including texture compression, semantics-aware compression, and joint image compression and restoration.
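To make the idea of decoding a single pixel from anisotropic, colored 2D Gaussians concrete, here is a minimal sketch. It assumes each Gaussian is parameterized by a mean, two axis scales, a rotation angle, and an RGB color, and that a pixel is a normalized weighted sum of the `top_k` strongest contributions. The abstract does not specify the blending rule, so the `Gaussian2D` class, `decode_pixel`, `top_k`, and `eps` are hypothetical names and choices.

```python
import math
from dataclasses import dataclass

@dataclass
class Gaussian2D:
    mean: tuple   # (x, y) center in pixel coordinates
    scale: tuple  # (sx, sy) standard deviations along the local axes
    theta: float  # rotation of the local axes, in radians
    color: tuple  # (r, g, b), each in [0, 1]

def weight(g, x, y):
    """Unnormalized Gaussian response of `g` at pixel (x, y)."""
    dx, dy = x - g.mean[0], y - g.mean[1]
    c, s = math.cos(g.theta), math.sin(g.theta)
    # Rotate the offset into the Gaussian's local frame.
    u = c * dx + s * dy
    v = -s * dx + c * dy
    return math.exp(-0.5 * ((u / g.scale[0]) ** 2 + (v / g.scale[1]) ** 2))

def decode_pixel(gaussians, x, y, top_k=10, eps=1e-8):
    """Blend the colors of the `top_k` Gaussians with the largest weights."""
    scored = sorted(((weight(g, x, y), g.color) for g in gaussians),
                    key=lambda wc: wc[0], reverse=True)[:top_k]
    total = sum(w for w, _ in scored) + eps
    return tuple(sum(w * col[ch] for w, col in scored) / total
                 for ch in range(3))

# Two isotropic Gaussians: red on the left, blue on the right.
gaussians = [
    Gaussian2D(mean=(2.0, 2.0), scale=(1.0, 1.0), theta=0.0, color=(1.0, 0.0, 0.0)),
    Gaussian2D(mean=(10.0, 2.0), scale=(1.0, 1.0), theta=0.0, color=(0.0, 0.0, 1.0)),
]
at_red = decode_pixel(gaussians, 2.0, 2.0)   # dominated by the red Gaussian
midway = decode_pixel(gaussians, 6.0, 2.0)   # equal mix of red and blue
```

Each pixel depends only on its own coordinates and the Gaussian list, which is what makes random access possible. This sketch scans every Gaussian for clarity; a practical decoder would presumably limit the search to Gaussians near the pixel.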

Paper Structure

This paper contains 32 sections, 8 equations, 19 figures, 1 table.

Figures (19)

  • Figure 1: Image-GS optimization pipeline. At initialization, a group of 2D Gaussians is adaptively spawned, guided by local image gradient magnitudes, with more Gaussians allocated to high-frequency areas (\ref{sec:method-optimization}). During training, their parameters (\ref{sec:method-gaussian}) are optimized using a custom differentiable renderer (\ref{sec:method-rendering}) to reconstruct the target, and additional Gaussians are progressively added to areas exhibiting persistent reconstruction errors (\ref{sec:method-optimization}). A random 20% sample of the Gaussians is visualized as colored elliptical discs (scale and shape determined by the covariance) to illustrate the optimization progress.
  • Figure 2: Rate-distortion curves (\ref{sec:evaluation-image}). These results report the metric scores averaged over the evaluation set of 45 RGB images (\ref{sec:evaluation-setup}).
  • Figure 3: System performance (\ref{sec:evaluation-system}). These results relate to the image experiments in \ref{sec:evaluation-image} and share the same color scheme as \ref{fig:evaluation-rate-distortion-image}.
  • Figure 4: Rate-distortion curves on the CLIC2020 benchmark (\ref{sec:evaluation-image}).
  • Figure 5: Rate-distortion curves on the 19 texture stacks (\ref{sec:evaluation-texture}).
  • ...and 14 more figures
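
The Figure 1 caption notes that each Gaussian is drawn as an elliptical disc whose scale and shape come from its covariance. As a rough sketch of that relationship, the code below builds a 2×2 covariance from two axis scales and a rotation angle, then recovers the one-sigma ellipse via the closed-form eigendecomposition of a symmetric 2×2 matrix. The scale-plus-rotation parameterization and the function names are assumptions for illustration, not details confirmed by the text above.

```python
import math

def covariance(sx, sy, theta):
    """Entries (a, b, d) of the covariance [[a, b], [b, d]] = R S S^T R^T."""
    c, s = math.cos(theta), math.sin(theta)
    a = c * c * sx * sx + s * s * sy * sy
    b = c * s * (sx * sx - sy * sy)
    d = s * s * sx * sx + c * c * sy * sy
    return a, b, d

def ellipse_from_covariance(a, b, d):
    """Semi-major axis, semi-minor axis, and major-axis angle (radians)."""
    mid = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    major = math.sqrt(mid + half_gap)              # sqrt of larger eigenvalue
    minor = math.sqrt(max(mid - half_gap, 0.0))    # sqrt of smaller eigenvalue
    angle = 0.5 * math.atan2(2.0 * b, a - d)       # direction of the major axis
    return major, minor, angle

# Round trip: scales (3, 1) rotated by 0.5 rad.
major, minor, angle = ellipse_from_covariance(*covariance(3.0, 1.0, 0.5))
```

The round trip returns the original scales and angle when the first scale is the larger one; otherwise the reported angle points along the longer axis, which differs from the input rotation by a quarter turn.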