From Noise to Latent: Generating Gaussian Latents for INR-Based Image Compression
Chaoyi Lin, Yaojun Wu, Yue Li, Junru Li, Kai Zhang, Li Zhang
TL;DR
The paper tackles the inefficiency of INR-based image compression by eliminating latent-code transmission: instead of storing explicit latents, it generates them directly from a shared Gaussian noise tensor. A lightweight Gaussian Parameter Predictor estimates per-pixel Gaussian parameters from the noise and applies the reparameterization $y_{pred} = \mu_{pred} + \sigma_{pred} \cdot z_M$ to produce image-specific latents, which a synthesis network then decodes into the reconstructed image. The approach overfits the model per image and uses a seed-signaled, multi-scale noise pyramid to capture spatial priors, achieving competitive rate-distortion performance on the Kodak and CLIC datasets while reducing decoding complexity compared to auto-regressive latent decoders. According to the authors, this is the first work to explore Gaussian latent generation from fixed noise for INR-based compression, offering a lightweight alternative to current latent-code pipelines. Overall, the results indicate that generating latents from noise can preserve latent-based benefits without transmitting latent codes, with robust behavior across seeds and favorable decoding times.
Abstract
Recent implicit neural representation (INR)-based image compression methods have shown competitive performance by overfitting image-specific latent codes. However, they remain inferior to end-to-end (E2E) compression approaches due to the absence of expressive latent representations. On the other hand, E2E methods rely on transmitting latent codes and require complex entropy models, leading to increased decoding complexity. Inspired by the normalization strategy in E2E codecs, where latents are transformed into Gaussian noise to demonstrate the removal of spatial redundancy, we explore the inverse direction: generating latents directly from Gaussian noise. In this paper, we propose a novel image compression paradigm that reconstructs image-specific latents from a multi-scale Gaussian noise tensor, deterministically generated using a shared random seed. A Gaussian Parameter Prediction (GPP) module estimates the distribution parameters, enabling one-shot latent generation via the reparameterization trick. The predicted latent is then passed through a synthesis network to reconstruct the image. Our method eliminates the need to transmit latent codes while preserving latent-based benefits, achieving competitive rate-distortion performance on the Kodak and CLIC datasets. To the best of our knowledge, this is the first work to explore Gaussian latent generation for learned image compression.
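
The seed-to-latent path described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`make_noise_pyramid`, `predict_gaussian_params`, `generate_latent`), the nearest-neighbor upsampling, the per-pixel linear map standing in for the GPP module, and the choice of the full-resolution noise level as $z_M$ are all assumptions made for illustration only.

```python
import numpy as np

def make_noise_pyramid(seed, height, width, num_scales=3):
    """Deterministically build a multi-scale Gaussian noise tensor from a seed.

    Each level is sampled at 1/2**s resolution and upsampled by
    nearest-neighbor repetition (an assumed choice) to (height, width).
    """
    rng = np.random.default_rng(seed)
    levels = []
    for s in range(num_scales):
        f = 2 ** s
        h, w = -(-height // f), -(-width // f)  # ceiling division
        coarse = rng.standard_normal((h, w))
        up = np.repeat(np.repeat(coarse, f, axis=0), f, axis=1)
        levels.append(up[:height, :width])
    return np.stack(levels)  # shape: (num_scales, height, width)

def predict_gaussian_params(noise, weights, bias):
    """Toy stand-in for the GPP module (hypothetical): a per-pixel linear map
    from the noise channels to (mu, log_sigma)."""
    out = np.tensordot(weights, noise, axes=([1], [0])) + bias[:, None, None]
    mu, log_sigma = out[0], out[1]
    return mu, np.exp(log_sigma)  # exp keeps sigma strictly positive

def generate_latent(seed, height, width, weights, bias):
    """One-shot latent generation: y_pred = mu_pred + sigma_pred * z."""
    noise = make_noise_pyramid(seed, height, width, num_scales=weights.shape[1])
    mu, sigma = predict_gaussian_params(noise, weights, bias)
    z = noise[0]  # assumption: full-resolution noise plays the role of z_M
    return mu + sigma * z

# Hypothetical "overfitted" parameters; in the paper these would be learned per image.
weights = np.array([[0.5, -0.2, 0.1], [0.0, 0.1, -0.1]])
bias = np.array([0.0, -1.0])
latent = generate_latent(seed=42, height=5, width=7, weights=weights, bias=bias)
```

Because the noise is a pure function of the seed, a decoder holding the same seed and predictor parameters regenerates the same latent without receiving it. In a real codec, bit-exact agreement between encoder and decoder would also depend on both sides using the same random number generator and numerical behavior, which this sketch does not address.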
