Channel Simulation and Distributed Compression with Ensemble Rejection Sampling

Buu Phan, Ashish Khisti

TL;DR

This work studies channel simulation and distributed matching through the Ensemble Rejection Sampling (ERS) framework. ERS combines rejection sampling with importance sampling to achieve near-optimal coding costs for channel simulation while boosting distributed matching probabilities to approach Poisson Matching Lemma (PML) performance, even when the decoder learns the target distribution (e.g., via machine learning). The authors establish concrete coding bounds for RS and ERS, quantify matching probabilities with and without batch communication, and extend these results to lossy compression with side information (Wyner-Ziv). Empirical results on synthetic Gaussian sources and MNIST (and CIFAR-10 variants) demonstrate that ERS yields competitive rate-distortion performance and unbiased sampling, illustrating practical applicability in distributed compression and learning-driven settings.

Abstract

We study channel simulation and distributed matching, two fundamental problems with several applications to machine learning, using a recently introduced generalization of the standard rejection sampling (RS) algorithm known as Ensemble Rejection Sampling (ERS). For channel simulation, we propose a new coding scheme based on ERS that achieves a near-optimal coding rate. In the process, we demonstrate that standard RS can also achieve a near-optimal coding rate, and we generalize the result of Braverman and Garg (2014) to the continuous alphabet setting. Next, as our main contribution, we present a distributed matching lemma for ERS, which serves as the rejection sampling counterpart to the Poisson Matching Lemma (PML) introduced by Li and Anantharam (2021). Our result also generalizes recent work on the importance matching lemma (Phan et al., 2024) and, to our knowledge, is the first result on distributed matching in the family of rejection sampling schemes where the matching probability is close to that of the PML. We demonstrate the practical significance of our approach over prior works by applying it to distributed compression. The effectiveness of our proposed scheme is validated through experiments involving synthetic Gaussian sources and distributed image compression using the MNIST dataset.

Paper Structure

This paper contains 59 sections, 11 theorems, 197 equations, 9 figures, 5 tables, 2 algorithms.

Key Result

Proposition 4.1

Given $(X,Y)\sim P_{X,Y}$ and $K$ defined as above. Then we have: Proof: See Appendix rej_coding_final.

Figures (9)

  • Figure 1: Left: Channel simulation setup. Middle: Distributed matching without communication. Right: Distributed matching with communication where the decoder's input $Z \sim P_{Z|X,Y_A}$ represents side information and/or messages from the encoder.
  • Figure 2: Left: Visualization of our Sorting Method for Standard RS. Right: Empirical results comparing $\mathbb{E}[\log(L)]$ and $\mathbb{E}[\log(\hat{K})]$ with their associated theoretical upper bounds across different target distributions. We use $P_Y(\cdot)=\mathcal{N}(0,1.0)$ and $P_{Y|X}(\cdot|x) = \mathcal{N}(1.0, \sigma^2)$ where $\sigma^2 \in [0.01, 0.1]$.
  • Figure 3: Left: Illustration of ERS Selection Method. Middle: Coding scheme for channel simulation. Right: Empirical results on the coding cost of $\hat{K}_1, \hat{K}_2$ and their theoretical upper bounds (in bits). Both figures use $P_Y(\cdot){=}\mathcal{N}(0,1.0)$, where the first figure sets $N=32$ and varies $P_{Y|X}(\cdot|x) {=}\mathcal{N}(1.0, \sigma^2)$ with $\sigma^2 \in [0.1, 5 ]{\times} 10^{-3}$. The second one fixes $\sigma^2{=}10^{-3}$ while varying $N$.
  • Figure 4: (Best viewed in color) We set $Q_Y{=}\mathcal{N}(0,100)$, $P^A_Y{=}\mathcal{N}(0.5, 0.7)$ and $P^B_Y{=}\mathcal{N}(-0.5,0.7)$. Left: Matching probabilities versus the batch size $N$. Middle: Matching probabilities versus the average number of proposals where the red and black dotted lines correspond to the batch sizes $\omega$ and $4\omega$ shown in the left figure. Right: Sample quality of IS, measured by the estimated variance $\hat{\sigma}^2$.
  • Figure 5: Left: Comparison of RD performance between different matching results for the Gaussian setting when targeting $-23\mathrm{dB}$ distortion (black dotted line), with the average number of proposals $N^*\in \{1.1\mathrm{e}6,1.6\mathrm{e}6\}$. Right: RD curves of different methods. Each group targets the same distortion levels and uses the same average number of proposals $N^*$ for ERS and IML, shown in the right table.
  • ...and 4 more figures
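
For readers who want a concrete picture of the Gaussian setting in Figure 2, the following is a minimal sketch of plain rejection sampling with proposal $\mathcal{N}(0,1)$ and target $\mathcal{N}(\mu, \sigma^2)$, $\sigma^2 < 1$. It is not the paper's Sorting Method or ERS scheme. The function names, the seed, and the choice $\mu = 1$, $\sigma^2 = 0.1$ are assumptions made for illustration. The returned index $K$ is the position of the first accepted proposal, which is the quantity an encoder would communicate when both sides share the proposal sequence.

```python
import math
import random


def log_ratio(y, mu, s2):
    """Log density ratio log[N(y; mu, s2) / N(y; 0, 1)]."""
    return -0.5 * math.log(s2) - (y - mu) ** 2 / (2 * s2) + y ** 2 / 2


def log_ratio_bound(mu, s2):
    """Closed-form log of sup_y of the ratio; finite only when 0 < s2 < 1."""
    if not 0 < s2 < 1:
        raise ValueError("this sketch assumes a target variance in (0, 1)")
    return -0.5 * math.log(s2) + mu ** 2 / (2 * (1 - s2))


def rejection_sample(mu, s2, rng):
    """Return (K, Y): index of the first accepted proposal and its value."""
    log_m = log_ratio_bound(mu, s2)
    k = 0
    while True:
        k += 1
        y = rng.gauss(0.0, 1.0)   # proposal drawn from N(0, 1)
        u = 1.0 - rng.random()    # uniform on (0, 1], so log(u) is defined
        if math.log(u) <= log_ratio(y, mu, s2) - log_m:
            return k, y


if __name__ == "__main__":
    rng = random.Random(0)  # illustrative seed
    draws = [rejection_sample(1.0, 0.1, rng) for _ in range(20000)]
    mean_log2_k = sum(math.log2(k) for k, _ in draws) / len(draws)
    mean_y = sum(y for _, y in draws) / len(draws)
    print(f"empirical E[log2 K] = {mean_log2_k:.3f}, sample mean of Y = {mean_y:.3f}")
```

Each proposal is accepted with probability $1/M$, where $M = \sup_y P_{Y|X}(y|x)/P_Y(y)$, so $K$ is geometric with mean $M$. Here $M$ grows as $\sigma^2$ shrinks.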

Theorems & Definitions (24)

  • Remark 3.1
  • Definition 3.2
  • Proposition 4.1
  • Remark 5.1
  • Proposition 5.2
  • Remark 5.3
  • Proposition 5.4
  • Remark 5.5
  • Proposition 5.6
  • Remark 5.7
  • ...and 14 more