Finite Block Length Rate-Distortion Theory for the Bernoulli Source with Hamming Distortion: A Tutorial

Bhaskar Krishnamachari

TL;DR

This work derives the classical rate-distortion function $R(D) = H(p) - H(D)$ from first principles, illustrates its computation via the Blahut-Arimoto algorithm, and develops the finite block length refinements that characterize how the minimum achievable rate approaches the Shannon limit as the block length grows.

Abstract

Lossy data compression lies at the heart of modern communication and storage systems. Shannon's rate-distortion theory provides the fundamental limit on how much a source can be compressed at a given fidelity, but it assumes infinitely long block lengths that are never realized in practice. We present a self-contained tutorial on rate-distortion theory for the simplest non-trivial source: a Bernoulli$(p)$ sequence with Hamming distortion. We derive the classical rate-distortion function $R(D) = H(p) - H(D)$ from first principles, illustrate its computation via the Blahut-Arimoto algorithm, and then develop the finite block length refinements that characterize how the minimum achievable rate approaches the Shannon limit as the block length $n$ grows. The central quantity in this refinement is the rate-distortion dispersion $V(D)$, which governs the $O(1/\sqrt{n})$ penalty for operating at finite block lengths. We support all theoretical developments with numerical examples and figures generated by accompanying Python scripts.
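As a small illustration of the closed form quoted in the abstract, the sketch below evaluates $R(D) = H(p) - H(D)$ in bits. The function names `binary_entropy` and `rate_distortion_bernoulli` are hypothetical and are not taken from the paper's scripts. The sketch also assumes the standard convention that $R(D) = 0$ once $D \ge \min(p, 1-p)$.

```python
import math


def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def rate_distortion_bernoulli(p, D):
    """R(D) = H(p) - H(D) for 0 <= D < min(p, 1 - p), and 0 beyond that point."""
    if D >= min(p, 1.0 - p):
        return 0.0
    return binary_entropy(p) - binary_entropy(D)


if __name__ == "__main__":
    # Source biases chosen for illustration only.
    for p in (0.11, 0.2, 0.3, 0.5):
        print(f"p={p:.2f}  R(0)={rate_distortion_bernoulli(p, 0.0):.4f}  "
              f"R(0.05)={rate_distortion_bernoulli(p, 0.05):.4f}")
```

At $D = 0$ the function returns $H(p)$, the lossless limit, and it falls to zero at $D = \min(p, 1-p)$, where always guessing the more likely symbol already meets the distortion target.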

Paper Structure (34 sections, 1 theorem, 62 equations, 12 figures, 1 algorithm)

Key Result

Theorem 6.1

For a discrete memoryless source with rate-distortion function $R(D)$ and dispersion $V(D) > 0$, the minimum rate $R(n, D, \varepsilon)$ at block length $n$ and excess-distortion probability $\varepsilon \in (0, 1)$ satisfies $R(n, D, \varepsilon) = R(D) + \sqrt{\frac{V(D)}{n}}\, Q^{-1}(\varepsilon) + O\!\left(\frac{\log n}{n}\right)$, where $Q^{-1}(\varepsilon)$ is the inverse of the Gaussian $Q$-function, $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$.
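To give a feel for the size of the finite block length penalty, the following sketch evaluates the Gaussian approximation $R(D) + \sqrt{V(D)/n}\, Q^{-1}(\varepsilon)$ for the Bernoulli source, ignoring lower-order remainder terms. Two points are assumptions rather than content of this excerpt: the function names are hypothetical, and the dispersion is taken to be $V = p(1-p)\log_2^2\frac{1-p}{p}$, the expression reported by Kostina and Verdú for the binary memoryless source when $0 \le D < p < 1/2$.

```python
import math
from statistics import NormalDist


def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def q_inverse(eps):
    """Inverse Gaussian Q-function: the x with Q(x) = eps, for 0 < eps < 1."""
    return NormalDist().inv_cdf(1.0 - eps)


def normal_approx_rate(p, D, n, eps):
    """Gaussian approximation R(D) + sqrt(V / n) * Q^{-1}(eps), in bits per symbol.

    Assumes 0 <= D < p < 1/2 and the dispersion formula named in the lead-in.
    Remainder terms are omitted.
    """
    if not (0.0 <= D < p < 0.5):
        raise ValueError("this sketch requires 0 <= D < p < 1/2")
    rate = binary_entropy(p) - binary_entropy(D)
    dispersion = p * (1.0 - p) * math.log2((1.0 - p) / p) ** 2  # assumed V, bits^2
    return rate + math.sqrt(dispersion / n) * q_inverse(eps)


if __name__ == "__main__":
    # Illustrative parameters, not taken from the paper's figures.
    p, D, eps = 0.3, 0.1, 0.01
    for n in (100, 1000, 10000):
        print(f"n={n:6d}  approx rate={normal_approx_rate(p, D, n, eps):.4f} bits")
    print(f"Shannon limit R(D)={binary_entropy(p) - binary_entropy(D):.4f} bits")
```

Because the penalty scales as $1/\sqrt{n}$, a tenfold increase in block length shrinks the gap to $R(D)$ by a factor of about $\sqrt{10} \approx 3.16$.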

Figures (12)

  • Figure 1: The binary entropy function $H(p)$ versus the source bias $p$. The entropy is maximized at $p = 1/2$, where each bit carries one full bit of information, and vanishes at $p \in \{0, 1\}$, where the source is deterministic.
  • Figure 2: Top: the operational lossy compression setup with encoder and decoder. Bottom: the test channel $p_{\hat{X}|X}$ that abstracts away the codebook structure. The rate-distortion function minimizes mutual information $I(X;\hat{X})$ over all test channels satisfying the distortion constraint.
  • Figure 3: The rate-distortion function $R(D) = H(p) - H(D)$ for a $\mathrm{Bernoulli}(p)$ source with Hamming distortion, shown for $p \in \{0.11, 0.2, 0.3, 0.5\}$. Each curve is convex and decreasing, starting at $R(0) = H(p)$ and reaching zero at $D = \min(p, 1-p)$. The $p = 0.5$ curve starts highest because the fair coin has the most entropy.
  • Figure 4: Convergence of the Blahut-Arimoto algorithm for $p = 0.3$ and slope parameters $s \in \{2, 5, 10, 20\}$. The rate converges monotonically to its final value within a few tens of iterations.
  • Figure 5: Comparison of the Blahut-Arimoto computed rate-distortion points (circles) with the closed-form curve $R(D) = H(p) - H(D)$ (solid line) for $p = 0.3$. The agreement is exact to numerical precision.
  • ...and 7 more figures
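The comparison described in the Figure 4 and Figure 5 captions can be reproduced in spirit with a minimal Blahut-Arimoto sketch. It is not the paper's script. The function name, the uniform initialization, the fixed iteration count, and the convention that the slope parameter $s$ enters through the weight $e^{-s\, d(x, \hat{x})}$ (natural exponent) are all assumptions made for illustration.

```python
import math


def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def blahut_arimoto_bernoulli(p, s, iters=200):
    """Run Blahut-Arimoto for a Bernoulli(p) source with Hamming distortion.

    Returns (D, R): the distortion and the rate in bits at slope parameter s.
    """
    px = [1.0 - p, p]        # source distribution P(X = 0), P(X = 1)
    q = [0.5, 0.5]           # initial reproduction marginal (assumed uniform)
    penalty = math.exp(-s)   # weight exp(-s * d) applied when x_hat != x
    for _ in range(iters):
        # Step 1: test channel Q(x_hat | x) proportional to q(x_hat) * exp(-s * d).
        Q = []
        for x in (0, 1):
            w = [q[xh] * (1.0 if xh == x else penalty) for xh in (0, 1)]
            total = sum(w)
            Q.append([wi / total for wi in w])
        # Step 2: reproduction marginal induced by the source and the test channel.
        q = [sum(px[x] * Q[x][xh] for x in (0, 1)) for xh in (0, 1)]
    # Expected Hamming distortion and mutual information I(X; X_hat) in bits.
    D = sum(px[x] * Q[x][xh] for x in (0, 1) for xh in (0, 1) if x != xh)
    R = sum(px[x] * Q[x][xh] * math.log2(Q[x][xh] / q[xh])
            for x in (0, 1) for xh in (0, 1))
    return D, R


if __name__ == "__main__":
    p = 0.3
    for s in (2, 5, 10, 20):
        D, R = blahut_arimoto_bernoulli(p, s)
        closed_form = binary_entropy(p) - binary_entropy(D)
        print(f"s={s:2d}  D={D:.6f}  R_BA={R:.6f}  H(p)-H(D)={closed_form:.6f}")
```

Each value of $s$ selects one point on the curve. Under the convention assumed here, the converged distortion is $D = 1/(1 + e^{s})$ whenever that value is below $p$, so larger $s$ corresponds to lower distortion and higher rate.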

Theorems & Definitions (13)

  • Definition 2.1: Binary Entropy
  • Definition 2.2: Mutual Information
  • Definition 3.1: Rate-Distortion Function
  • Remark 4.1
  • Remark 4.2
  • Remark 5.1
  • Remark 5.2: Intuition for Step 1
  • Definition 6.1: $(n, M, D, \varepsilon)$ Code
  • Definition 6.2: $d$-Tilted Information (Kostina and Verdú, 2012)
  • proof
  • ...and 3 more