Bregman-divergence-based Arimoto-Blahut algorithm

Masahito Hayashi

TL;DR

This work extends the Arimoto-Blahut (AB) algorithm to general objective functions defined over a Bregman-divergence system, in which linear constraints are encoded as mixture families. By relating the AB updates to e-projections within that system, the author shows that the algorithm is equivalent to mirror descent in convex settings. Existing methods must solve a convex minimization in every iteration when linear constraints are imposed; with a suitable choice of the Bregman generator, the proposed iteration avoids this step. The method applies to classical and quantum rate-distortion (RD) theory and to EM-type algorithms, covering both mixtures of distributions and quantum states, and a numerical RD example computes the optimal conditional distribution. Convergence is guaranteed under stated conditions, and the framework points toward extensions to non-differentiable or non-probabilistic objective functions.

Abstract

We generalize the generalized Arimoto-Blahut algorithm to a general function defined over a Bregman-divergence system. In existing methods, when linear constraints are imposed, each iteration must solve a convex minimization. Using the generalized algorithm, we propose a minimization-free iteration. This algorithm can be applied to classical and quantum rate-distortion theory. We numerically apply our method to derive the optimal conditional distribution in rate-distortion theory.
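
For orientation, the sketch below implements the classical Blahut iteration for rate-distortion at a fixed Lagrange slope. It is not the Bregman-divergence algorithm proposed in the paper; it only illustrates the kind of closed-form alternating update that the paper generalizes. The function name `blahut_rate_distortion` and the parameters `beta`, `n_iter`, and `tol` are assumptions introduced for this illustration.

```python
import numpy as np

def blahut_rate_distortion(p_x, d, beta, n_iter=500, tol=1e-12):
    """Classical Blahut iteration at slope beta > 0 (illustrative sketch).

    p_x : source distribution, shape (nx,)
    d   : distortion matrix d[x, y], shape (nx, ny)
    Returns (q_y_given_x, rate_in_nats, expected_distortion).
    """
    p_x = np.asarray(p_x, dtype=float)
    d = np.asarray(d, dtype=float)
    ny = d.shape[1]
    q_y = np.full(ny, 1.0 / ny)      # start from the uniform output marginal
    kernel = np.exp(-beta * d)       # exp(-beta * d(x, y)), fixed throughout

    def conditional(q):
        # Closed-form step: q(y|x) proportional to q(y) * exp(-beta * d(x, y))
        w = q * kernel
        return w / w.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Closed-form step: output marginal induced by the current conditional
        new_q_y = p_x @ conditional(q_y)
        converged = np.max(np.abs(new_q_y - q_y)) < tol
        q_y = new_q_y
        if converged:
            break

    q_y_given_x = conditional(q_y)
    joint = p_x[:, None] * q_y_given_x
    marg = np.broadcast_to(joint.sum(axis=0), joint.shape)
    mask = joint > 0                 # skip zero-probability cells in the log
    rate = np.sum(joint[mask] * np.log(q_y_given_x[mask] / marg[mask]))
    distortion = np.sum(joint * d)
    return q_y_given_x, rate, distortion
```

Neither step solves an optimization problem numerically; each is a normalization or a matrix product. For a uniform binary source with Hamming distortion, the returned pair should lie on the known curve $R(D)=\ln 2 - h(D)$ (in nats), which gives a quick sanity check.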

Paper Structure (22 sections, 10 theorems, 108 equations, 1 figure, 4 tables, 5 algorithms)

Key Result

Theorem 1

Suppose that every pair $(\theta^{[t]},\theta^{[t+1]})$ satisfies the following condition with $(\theta,\theta')=(\theta^{[t]},\theta^{[t+1]})$ for some sufficiently large positive number $\gamma$. Then Algorithm AL1 improves the value of the objective function at every iteration.

Figures (1)

  • Figure 1: Behavior of $\tilde{\cal G}(\theta^{[t]})-\tilde{\cal G}(\theta^{[\infty]})$ of the minimum mutual information. The vertical axis shows the value of $\tilde{\cal G}(\theta^{[t]})-\tilde{\cal G}(\theta^{[\infty]})$. The horizontal axis shows the number of iterations. The red points show the number of iterations of the calculation \ref{ITE2} in Algorithm \ref{protocol1-3} with $f_1$. The purple points show the same number in Algorithm \ref{protocol1-3} with $f_2$.

Theorems & Definitions (19)

  • Definition 1: Bregman divergence
  • Theorem 1
  • Theorem 2
  • Example 1
  • Example 2
  • Lemma 3
  • proof
  • Lemma 4
  • Lemma 5: HSF1
  • Theorem 6
  • ...and 9 more
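
Definition 1 in the list above concerns the Bregman divergence. As a reminder of the textbook form, $D_F(a\|b)=F(a)-F(b)-\langle\nabla F(b),a-b\rangle$ for a differentiable convex generator $F$, the sketch below evaluates it numerically. The paper's own definition may use a different parameterization or argument order, so this should be read as the generic form only; all function names here are assumptions made for illustration.

```python
import numpy as np

def bregman_divergence(F, grad_F, a, b):
    """Textbook Bregman divergence D_F(a || b) = F(a) - F(b) - <grad F(b), a - b>."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return F(a) - F(b) - np.dot(grad_F(b), a - b)

def neg_entropy(p):
    # Generator F(p) = sum_i p_i log p_i, for strictly positive p
    return np.sum(p * np.log(p))

def grad_neg_entropy(p):
    return np.log(p) + 1.0

def half_sq_norm(x):
    # Generator F(x) = 0.5 * ||x||^2
    return 0.5 * np.dot(x, x)

def grad_half_sq_norm(x):
    return x
```

With the negative-entropy generator and two probability vectors, the value coincides with the Kullback-Leibler divergence; with the half squared norm it reduces to half the squared Euclidean distance. The first case is the one underlying the classical Arimoto-Blahut setting.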