RECOMBINER: Robust and Enhanced Compression with Bayesian Implicit Neural Representations

Jiajun He; Gergely Flamich; Zongyu Guo; José Miguel Hernández-Lobato

TL;DR

The proposed method, Robust and Enhanced COMBINER (RECOMBINER), achieves competitive results with the best INR-based methods and even outperforms autoencoder-based codecs on low-resolution images at low bitrates.

Abstract

COMpression with Bayesian Implicit NEural Representations (COMBINER) is a recent data compression method that addresses a key inefficiency of previous Implicit Neural Representation (INR)-based approaches: it avoids quantization and enables direct optimization of the rate-distortion performance. However, COMBINER still has significant limitations: 1) it uses factorized priors and posterior approximations that lack flexibility; 2) it cannot effectively adapt to local deviations from global patterns in the data; and 3) its performance can be susceptible to modeling choices and the variational parameters' initializations. Our proposed method, Robust and Enhanced COMBINER (RECOMBINER), addresses these issues by 1) enriching the variational approximation while retaining a low computational cost via a linear reparameterization of the INR weights, 2) augmenting our INRs with learnable positional encodings that enable them to adapt to local details and 3) splitting high-resolution data into patches to increase robustness and utilizing expressive hierarchical priors to capture dependency across patches. We conduct extensive experiments across several data modalities, showcasing that RECOMBINER achieves competitive results with the best INR-based methods and even outperforms autoencoder-based codecs on low-resolution images at low bitrates. Our PyTorch implementation is available at https://github.com/cambridge-mlg/RECOMBINER/.
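To make the first improvement concrete, here is a minimal sketch of why a linear reparameterization can enrich a factorized variational approximation. It assumes, purely for illustration, that the INR weights are written as w = hA, where h is a latent vector with an independent Gaussian posterior and A is a learned matrix; the names `reparameterize` and `induced_covariance` and the toy values are hypothetical and are not taken from the authors' implementation.

```python
import random

def reparameterize(h, A):
    """Map a latent vector h (length k) to weights w = h A, where A is k x n."""
    n = len(A[0])
    return [sum(h[i] * A[i][j] for i in range(len(h))) for j in range(n)]

def induced_covariance(sigma, A):
    """Covariance of w = h A when h has independent components with
    standard deviations sigma: A^T diag(sigma^2) A."""
    n = len(A[0])
    return [
        [sum(sigma[i] ** 2 * A[i][p] * A[i][q] for i in range(len(sigma)))
         for q in range(n)]
        for p in range(n)
    ]

# Toy values (assumed for illustration only).
A = [[1.0, 0.5],
     [0.0, 2.0]]
mu = [0.0, 0.0]
sigma = [1.0, 0.5]

rng = random.Random(0)
h = [rng.gauss(m, s) for m, s in zip(mu, sigma)]  # sample from the factorized posterior
w = reparameterize(h, A)                          # weights handed to the INR
cov = induced_covariance(sigma, A)                # off-diagonal entries are non-zero
```

In this toy case the components of h are independent, yet the induced covariance over w has a non-zero off-diagonal entry. The latent posterior stays cheap to sample and encode while the weights become correlated.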

Paper Structure (36 sections, 5 equations, 17 figures, 8 tables, 1 algorithm)

Introduction
Background
Methods
Linear Reparameterization for the Network Parameters
Learned Positional Encodings
Scaling To High-Resolution Data with Patches
Extended Training Procedure
Related Works
Experimental Results
Data Compression across Modalities
Effectiveness of Our Solutions, Ablation Studies and Runtime Analysis
Conclusions and Limitations
Acknowledgements
Notations
RECOMBINER's Training Algorithms
...and 21 more sections

Figures (17)

Figure 1: Schematic of (a) COMBINER and (b) RECOMBINER, our proposed method. See the Background and Methods sections for notation. As the INR's input, RECOMBINER uses ${\mathbf{h}}_{\mathbf{z}}$ upsampled to pixel-wise positional encodings concatenated with Fourier embeddings. (c) A closer look at how RECOMBINER maps ${\mathbf{h}}_{\mathbf{z}}$ to the INR input, taking images as an example. FE: Fourier embeddings; FC: fully connected layer.
Figure 2: Illustration of (a) the three-level hierarchical model and (b) our permutation strategy.
Figure 3: Quantitative evaluation and qualitative examples of RECOMBINER on image, audio, video, and 3D protein structure data. Kbps stands for kilobits per second, RMSD stands for Root Mean Square Deviation, and bpa stands for bits per atom. For all plots, we use solid lines to denote INR-based codecs, dotted lines to denote VAE-based codecs, and dashed lines to denote classical codecs.
Figure 4: Comparison between kodim24 details compressed with and without learnable positional encodings. (a)(b) have similar bitrates and (a)(c) have similar PSNRs.
Figure 5: (a) RD performances of COMBINER and RECOMBINER with different numbers of hidden units. (b)(c) Ablation studies on CIFAR-10 and Kodak. LR: linear reparameterization; PE: positional encodings; HM: hierarchical model; RP: random permutation across patches. We describe the details of experimental settings for ablation studies in the appendix on ablation study settings.
...and 12 more figures
