
Distributed Compression in the Era of Machine Learning: A Review of Recent Advances

Ezgi Ozyilkan, Elza Erkip

TL;DR

The paper surveys the rise of learning-based approaches to distributed compression, linking neural transform coding techniques to classical information-theoretic results such as Slepian–Wolf and Wyner–Ziv. It analyzes learned data compression for point-to-point and distributed settings, including real-life image sources and abstract distributions, and highlights methods that recover or approximate optimal binning strategies under decoder-side information. A key theme is that neural compressors can achieve competitive rate–distortion performance and yield interpretable mechanisms, while still lacking comprehensive theoretical guarantees in fully distributed scenarios. The discussion points to open challenges in robustness, scalability, and practical deployment in multi-view and cooperative communication networks, outlining directions for future research.

Abstract

Many applications from camera arrays to sensor networks require efficient compression and processing of correlated data, which in general is collected in a distributed fashion. While information-theoretic foundations of distributed compression are well investigated, the impact of theory in practice-oriented applications to this day has been somewhat limited. As the field of data compression is undergoing a transformation with the emergence of learning-based techniques, machine learning is becoming an important tool to reap the long-promised benefits of distributed compression. In this paper, we review the recent contributions in the broad area of learned distributed compression techniques for abstract sources and images. In particular, we discuss approaches that provide interpretable results operating close to information-theoretic bounds. We also highlight unresolved research challenges, aiming to inspire fresh interest and advancements in the field of learned distributed compression.

Paper Structure

This paper contains 8 sections, 2 theorems, 3 equations, and 3 figures.

Key Result

Theorem 1

(Slepian–Wolf Theorem [1973]) The optimal rate region for distributed lossless source coding of a pair of discrete memoryless sources $(X,Y)$ is the set of rate pairs $(R_{X}, R_{Y})$ such that:

$$R_{X} \geq H(X \mid Y), \quad R_{Y} \geq H(Y \mid X), \quad R_{X} + R_{Y} \geq H(X, Y).$$
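As a numerical illustration, the sketch below computes the three entropy bounds that delimit the Slepian–Wolf region, $H(X \mid Y)$, $H(Y \mid X)$, and $H(X, Y)$, and checks whether a given rate pair satisfies them. The joint distribution (a doubly symmetric binary source with crossover probability 0.1) and the function names `entropy`, `slepian_wolf_bounds`, and `in_region` are assumptions chosen for illustration; they do not come from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def slepian_wolf_bounds(joint):
    """Return (H(X|Y), H(Y|X), H(X,Y)) in bits.

    `joint` is a list of rows: joint[x][y] = P(X = x, Y = y).
    """
    h_xy = entropy(p for row in joint for p in row)
    h_x = entropy(sum(row) for row in joint)        # marginal of X
    h_y = entropy(sum(col) for col in zip(*joint))  # marginal of Y
    # Chain rule: H(X|Y) = H(X,Y) - H(Y), and symmetrically for H(Y|X).
    return h_xy - h_y, h_xy - h_x, h_xy


def in_region(r_x, r_y, joint, tol=1e-12):
    """Check the three Slepian-Wolf inequalities for a rate pair (bits/symbol)."""
    h_x_given_y, h_y_given_x, h_xy = slepian_wolf_bounds(joint)
    return (
        r_x >= h_x_given_y - tol
        and r_y >= h_y_given_x - tol
        and r_x + r_y >= h_xy - tol
    )


# Hypothetical source: X is a fair bit, Y is X flipped with probability 0.1.
p = 0.1
joint = [[(1 - p) / 2, p / 2], [p / 2, (1 - p) / 2]]
bounds = slepian_wolf_bounds(joint)
print(bounds)
print(in_region(1.0, 0.47, joint))  # Y sent at close to H(Y|X), X at full rate
print(in_region(0.5, 0.5, joint))   # sum rate below H(X,Y)
```

For this source each encoder would need 1 bit per symbol if it compressed on its own, whereas the sum-rate bound $H(X, Y)$ is roughly 1.47 bits, which is the gain that distributed coding promises.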

Figures (3)

  • Figure 1: Distributed lossless compression with two sources, also known as Slepian–Wolf coding.
  • Figure 2: Rate–distortion with side information, also known as Wyner–Ziv coding.
  • Figure 3: Figure taken from ozyilkan2023learned. Visualization (best viewed in color) of the learned encoder and decoder of the operational scheme proposed in ozyilkan2023learned, for the quadratic-Gaussian Wyner–Ziv setup. The dashed horizontal lines are quantization boundaries, and the colors between boundaries represent unique values of bin indices. The decoding function is depicted as separate plots for each value of bin index, using the same color assignment. Color coding reveals that the model learns discontiguous quantization bins.
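To make the idea of discontiguous quantization bins concrete, here is a minimal handcrafted sketch of scalar binning with decoder-side information. It is not the learned scheme from ozyilkan2023learned: it uses a classical uniform quantizer followed by a modulo operation, and the names `encode` and `decode`, the step size, and the number of bins are all assumptions for illustration. The encoder sends only the cell index modulo `num_bins`, so cells that are far apart share one bin index, and the decoder uses the side information `y` to pick the intended cell.

```python
import math


def encode(x, step=0.5, num_bins=4):
    """Quantize x uniformly, then keep only the cell index modulo num_bins."""
    cell = math.floor(x / step)
    return cell % num_bins


def decode(bin_index, y, step=0.5, num_bins=4):
    """Return the center of the cell with this bin index that is closest to y."""
    y_cell = math.floor(y / step)
    # Largest cell index <= y_cell that is congruent to bin_index mod num_bins.
    base = y_cell - ((y_cell - bin_index) % num_bins)
    candidates = [base - num_bins, base, base + num_bins]
    centers = [(c + 0.5) * step for c in candidates]
    return min(centers, key=lambda center: abs(center - y))


# Two distant inputs land in the same (discontiguous) bin.
print(encode(1.3), encode(3.3))

# With side information close to x, the decoder recovers the right cell.
x, y = 1.3, 1.1
x_hat = decode(encode(x), y)
print(x_hat)
```

The sketch only works when `y` is close enough to `x` relative to `step * num_bins`; if the side information is too noisy, the decoder picks the wrong cell, which is the trade-off between rate and reliability that binning schemes have to balance.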

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2