Accelerating Relative Entropy Coding with Space Partitioning

Jiajun He, Gergely Flamich, José Miguel Hernández-Lobato

TL;DR

This work introduces a relative entropy coding (REC) scheme that uses space partitioning to reduce runtime in practical scenarios. It also reduces the bitrate in VAE-based lossless compression on MNIST and INR-based lossy compression on CIFAR-10, significantly improving the practicality of REC for neural compression.

Abstract

Relative entropy coding (REC) algorithms encode a random sample following a target distribution $Q$, using a coding distribution $P$ shared between the sender and receiver. Sadly, general REC algorithms suffer from prohibitive encoding times, at least on the order of $2^{D_{\text{KL}}[Q||P]}$, and faster algorithms are limited to very specific settings. This work addresses this issue by introducing a REC scheme utilizing space partitioning to reduce runtime in practical scenarios. We provide theoretical analyses of our method and demonstrate its effectiveness with both toy examples and practical applications. Notably, our method successfully handles REC tasks with $D_{\text{KL}}[Q||P]$ about three times greater than what previous methods can manage, and reduces the bitrate by approximately 5-15% in VAE-based lossless compression on MNIST and INR-based lossy compression on CIFAR-10, compared to previous methods, significantly improving the practicality of REC for neural compression.
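
To give a sense of the $2^{D_{\text{KL}}[Q||P]}$ scaling mentioned in the abstract, here is a minimal Python sketch that is not taken from the paper. It assumes a one-dimensional Gaussian target and coding distribution, chosen purely for illustration, and the function names are hypothetical. It evaluates the closed-form KL divergence in bits and the corresponding order-of-magnitude sample count.

```python
import math


def gaussian_kl_bits(mu_q, sigma_q, mu_p, sigma_p):
    """D_KL[Q || P] in bits for Q = N(mu_q, sigma_q^2), P = N(mu_p, sigma_p^2).

    The Gaussian setting is an illustrative assumption, not the paper's setup.
    """
    kl_nats = (
        math.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )
    return kl_nats / math.log(2.0)


def sample_count_order(kl_bits):
    """Order of magnitude 2^KL that the abstract cites for general REC runtime."""
    return 2.0 ** kl_bits


if __name__ == "__main__":
    kl = gaussian_kl_bits(mu_q=3.0, sigma_q=0.1, mu_p=0.0, sigma_p=1.0)
    print(f"KL = {kl:.2f} bits, about {sample_count_order(kl):.0f} samples")
    # Tripling the KL cubes the sample count: 2^(3k) = (2^k)^3.
    print(sample_count_order(3 * kl) / sample_count_order(kl) ** 3)
```

Under this scaling, tripling the KL divergence cubes the sample count, which is why handling a KL about three times greater is a substantial gain.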

Paper Structure

This paper contains 40 sections, 10 theorems, 73 equations, 13 figures, 1 table, 5 algorithms.

Key Result

Theorem 3.1

Let a pair of correlated random variables ${\mathbf{X}}, {\mathbf{Z}} \sim P_{{\mathbf{X}}, {\mathbf{Z}}}$ be given. Assume we perform relative entropy coding using the space-partitioning PFR algorithm (SP-PFR) and let $j^*$ denote the bin index and $\tilde{\mathscr{n}}^*$ the local sample index returned by the algorithm. Then, the e

Figures (13)

  • Figure 1: An illustrative comparison between the standard REC algorithm and REC with space partitioning. We illustrate the prior $P$'s density in blue and $Q$'s density in orange. (a) In a standard REC algorithm, we may draw numerous samples (colored in red) before identifying one that aligns well with $Q$ (colored in green). The majority of these samples do not directly contribute to the desired result. (b) In the method we propose, we first divide the search space into smaller grids and then reweight each grid. This amounts to adjusting the prior $P$ to a search heuristic $P'$, which can align better with $Q$. The samples from $P'$ will thus be more relevant to $Q$, potentially reducing the runtime.
  • Figure 2: Comparing standard PFR and PFR with our proposed space partitioning algorithm on toy examples. Solid lines and shaded areas represent the mean and interquartile range (IQR), respectively.
  • Figure 3: Comparing standard ORC and ORC with our proposed space partitioning algorithm on toy examples.
  • Figure 4: Rate-distortion curves of RECOMBINER using our proposed algorithm and standard ORC. We also provide the theoretical RD curve for an ideal REC algorithm (i.e., assuming we can encode an exact sample in a single block, whose codelength is given by the standard REC codelength formula). Notably, our method's performance is already very close to this theoretical result.
  • Figure 5: Elucidating the generality of space partitioning. $\Omega$ represents the original space. We use dashed lines to represent the boundaries of the partitions. (a) We can add an auxiliary axis $x_{\text{aux}}$ to the original space, forming an augmented space. We define the prior and target in the auxiliary axis as $P_\text{aux}$ and $Q_\text{aux}$, and define the prior and target in the augmented space as $P_\text{aug} = P_\text{aux}\times P$, $Q_\text{aug} = Q_\text{aux}\times Q$. We also require $P_\text{aux}$ and $Q_\text{aux}$ to be the same, so that $D_{\text{KL}}[Q_\text{aug}||P_\text{aug}]$ is the same as the original KL $D_{\text{KL}}[Q||P]$. Dividing the augmented space into non-overlapping bins will lead to non-overlapping bins or overlapping bins in the original space. For example, as shown in (b), dividing the augmented space into non-overlapping bins whose boundaries are parallel to the auxiliary axis results in the standard non-overlapping bins in the original space $\Omega$. As shown in (c), dividing the augmented space into non-overlapping bins whose boundaries are orthogonal to the auxiliary axis results in fully overlapping bins in the original space $\Omega$. Also, as in (d), the augmented space can be divided in an arbitrary manner, leading to generally overlapping bins in the original space $\Omega$.
  • ...and 8 more figures
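
The intuition in the Figure 1 caption, that reweighting bins turns the prior $P$ into a search heuristic $P'$ closer to $Q$, can be illustrated with a toy calculation. The sketch below is not the paper's algorithm. It assumes $P$ is uniform on $[0, 1)$, $Q$ is uniform on a sub-interval, the bins have equal width, and each bin is reweighted to carry exactly the mass $Q$ assigns to it. All of these choices, and the function names, are hypothetical. It only computes $D_{\text{KL}}[Q||P']$ and does not model the cost of encoding the bin index.

```python
import math


def overlap_length(lo, hi, a, b):
    """Length of the overlap between the intervals [lo, hi) and [a, b)."""
    return max(0.0, min(hi, b) - max(lo, a))


def kl_bits_after_partitioning(a, b, num_bins):
    """D_KL[Q || P'] in bits for Q = Uniform[a, b) and P = Uniform[0, 1).

    P' is uniform inside each of `num_bins` equal-width bins, and bin j is
    rescaled to carry total mass Q(bin j). This weighting is an assumption
    made for illustration only.
    """
    q_density = 1.0 / (b - a)
    width = 1.0 / num_bins
    kl = 0.0
    for j in range(num_bins):
        lo, hi = j * width, (j + 1) * width
        q_mass = overlap_length(lo, hi, a, b) * q_density
        if q_mass == 0.0:
            continue  # Q places no mass here, so the bin adds nothing to the KL
        p_prime_density = q_mass / width
        kl += q_mass * math.log2(q_density / p_prime_density)
    return kl


if __name__ == "__main__":
    for bins in (1, 2, 4):
        print(bins, kl_bits_after_partitioning(0.25, 0.5, bins))
```

With a single bin, $P'$ equals $P$ and the function returns the original $D_{\text{KL}}[Q||P]$. Finer partitions can only bring $P'$ closer to $Q$ in this toy setting.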

Theorems & Definitions (16)

  • Theorem 3.1
  • Proposition 3.2: Bound of $\epsilon$ for Uniform and Gaussian
  • Theorem 3.3
  • Corollary 4.1: Biasness of sample encoded by ORC
  • Theorem 3.1
  • proof
  • Lemma C.1
  • proof
  • Lemma C.2
  • proof
  • ...and 6 more