Discrete Bridges for Mutual Information Estimation

Iryna Zabarianska; Sergei Kholkin; Grigoriy Ksenofontov; Ivan Butakov; Alexander Korotin

TL;DR

The paper tackles mutual information (MI) estimation for discrete random variables by reframing it as a domain-transfer problem that discrete bridge matching can solve. It introduces DBMI (Discrete Bridge Mutual Information), which expresses MI as a KL divergence between reciprocal processes. These processes are represented as conditional Markov transitions learned with a bridge-matching objective, and MI is then estimated from the learned transitions. The approach is validated on a low-dimensional synthetic benchmark and a novel high-dimensional image-based discrete benchmark, where DBMI outperforms existing neural estimators (e.g., MINE, InfoNCE, NWJ, f-DIME). Simulation-free training and a scalable factorization of the transitions enable accurate MI estimation in complex discrete spaces, with potential impact on information-theoretic analyses and applications involving discrete data.

Abstract

Diffusion bridge models in both continuous and discrete state spaces have recently become powerful tools in the field of generative modeling. In this work, we leverage the discrete state space formulation of bridge matching models to address another important problem in machine learning and information theory: the estimation of the mutual information (MI) between discrete random variables. By neatly framing MI estimation as a domain transfer problem, we construct a Discrete Bridge Mutual Information (DBMI) estimator suitable for discrete data, which poses difficulties for conventional MI estimators. We showcase the performance of our estimator on two MI estimation settings: low-dimensional and image-based.
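For orientation, the quantity being estimated can be written down exactly when the joint distribution is a small, fully known table. The sketch below uses only the standard identity that MI equals the KL divergence between the joint distribution and the product of its marginals; it is not the DBMI estimator, and the function name `mutual_information` and the toy tables are illustrative assumptions. Such a plug-in computation enumerates every state pair, so it only works for small tables, which is presumably why high-dimensional discrete data call for a learned estimator.

```python
import math


def mutual_information(joint):
    """Exact MI in nats for a joint probability table p(x0, x1).

    Hypothetical helper for illustration only: it evaluates
    KL(p(x0, x1) || p(x0) p(x1)) by direct enumeration.
    """
    total = sum(sum(row) for row in joint)
    p = [[v / total for v in row] for row in joint]  # normalise the table
    p0 = [sum(row) for row in p]                     # marginal of X0 (rows)
    p1 = [sum(col) for col in zip(*p)]               # marginal of X1 (columns)
    mi = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v > 0:  # zero-probability cells contribute nothing
                mi += v * math.log(v / (p0[i] * p1[j]))
    return mi


# Toy tables (assumed, not taken from the paper).
independent = [[0.125, 0.375], [0.125, 0.375]]  # X0 and X1 independent
copy = [[0.5, 0.0], [0.0, 0.5]]                 # X1 is an exact copy of X0

print(mutual_information(independent))
print(mutual_information(copy))
```

The independent table gives zero MI, while the copy table gives the entropy of a fair bit, $\log 2$ nats.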

Paper Structure (48 sections, 2 theorems, 27 equations, 4 figures, 5 tables, 2 algorithms)

Introduction
Mutual Information (MI)
Bridge Matching.
Contributions.
Notation.
Background
Mutual Information
Reciprocal Processes
Reciprocal Processes Conditioned on a Point
Reciprocal Processes as Conditioned Markov Chains
Bridge Matching for Discrete State Spaces
Related Work
Non-parametric Estimators.
Variational Estimators.
Diffusion Based Estimators.
...and 33 more sections

Key Result

Proposition 1

Consider the reciprocal process conditioned on a point $x_0$, denoted $r_{\pi|x_0}(x_{\rm in}, x_1)$. Then $r_{\pi|x_0}(x_{\rm in}, x_1)$ is Markov:

Figures (4)

Figure 1: Qualitative samples, presented in $5\times2$ image grids, generated using the image benchmark and DBMI, i.e., $r_{\theta}(\cdot)$, at resolution $32\times32$.
Figure 2: Comparison of the estimated mutual information $\hat{\mathsf{I}}(X_0; X_1)$ across methods against the ground truth $\mathsf{I}(X_0; X_1)$ (red) on the high-dimensional image benchmark at resolution $32\times32$.
Figure 3: Qualitative samples, presented in $5\times2$ image grids, generated using the image benchmark and DBMI, i.e., $r_{\theta}(\cdot)$, at resolution $16\times16$.
Figure 4: Comparison of the estimated mutual information $\hat{\mathsf{I}}(X_0; X_1)$ across methods against the ground truth $\mathsf{I}(X_0; X_1)$ (red) on the high-dimensional image benchmark at resolution $16\times16$.

Theorems & Definitions (4)

Proposition 1
Proposition 2: Mutual Information Decomposition
proof
proof
