Efficient and Provably Convergent Computation of Information Bottleneck: A Semi-Relaxed Approach

Lingyi Chen; Shitong Wu; Jiachuan Ye; Huihui Wu; Wenyi Zhang; Hao Wu

TL;DR

The paper addresses the challenge of computing the relevance-information (RI) function in Information Bottleneck with precision and convergence guarantees. It introduces a semi-relaxed IB model that relaxes the Markov chain and marginal constraints and solves it with an Alternating Bregman Projection (ABP) algorithm that yields closed-form updates and a provable convergence guarantee via descent analysis and Pinsker's inequality. The authors prove equivalence between the semi-relaxed and original RI problems, derive efficient update rules for w, r, and z, and demonstrate convergence to a KKT point. Empirically, ABP outperforms existing methods (BA, GAS, ADMM) in speed while maintaining accuracy, across classical distributions and real data like the Iris dataset, including regimes with phase transitions. The work offers a scalable, reliable approach for IB computations with potential impact on information-theoretic analysis and deep learning applications that rely on accurate IB computations.
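For reference, Pinsker's inequality in its standard form is given below. It bounds the $L_1$ distance between two distributions by their Kullback-Leibler divergence, which is presumably how a descent estimate stated in divergence terms is turned into convergence of the iterates. The symbols $p$ and $q$ are generic distributions on a common finite alphabet, not the paper's variables, and the paper's exact use of the bound is not reproduced here.

$$
D_{\mathrm{KL}}(p \,\|\, q) \;=\; \sum_{i} p_i \ln \frac{p_i}{q_i} \;\ge\; \frac{1}{2}\,\|p - q\|_1^2,
\qquad \|p - q\|_1 = \sum_{i} |p_i - q_i| .
$$

The divergence is measured in nats here. With base-2 logarithms the right-hand side becomes $\frac{1}{2 \ln 2}\,\|p - q\|_1^2$.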

Abstract

Information Bottleneck (IB) is a technique to extract information about one target random variable through another relevant random variable. This technique has garnered significant interest due to its broad applications in information theory and deep learning. Hence, there is a strong motivation to develop efficient numerical methods with high precision and theoretical convergence guarantees. In this paper, we propose a semi-relaxed IB model, where the Markov chain and transition probability condition are relaxed from the relevance-compression function. Based on the proposed model, we develop an algorithm, which recovers the relaxed constraints and involves only closed-form iterations. Specifically, the algorithm is obtained by analyzing the Lagrangian of the relaxed model with alternating minimization in each direction. The convergence property of the proposed algorithm is theoretically guaranteed through descent estimation and Pinsker's inequality. Numerical experiments across classical and discrete distributions corroborate the analysis. Moreover, our proposed algorithm demonstrates notable advantages in terms of computational efficiency, evidenced by significantly reduced run times compared to existing methods with comparable accuracy.
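To make the computational task concrete, the sketch below implements the classical self-consistent IB iteration in the Blahut-Arimoto style, the kind of baseline the proposed method is compared against. It is not the paper's ABP algorithm. The function names, the toy joint distribution `p_xy`, the trade-off parameter `beta`, and the iteration count are all illustrative assumptions.

```python
import math
import random


def mutual_information(joint):
    """I(A;B) in nats for a joint distribution given as a list of rows."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (p_a[i] * p_b[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def ib_self_consistent(p_xy, n_t, beta, n_iter=200, seed=0, eps=1e-12):
    """Classical self-consistent IB iteration (BA-style baseline, NOT ABP)."""
    rng = random.Random(seed)
    n_x, n_y = len(p_xy), len(p_xy[0])
    p_x = [sum(row) for row in p_xy]
    p_y_given_x = [[p / p_x[x] for p in p_xy[x]] for x in range(n_x)]

    # Random row-stochastic initial encoder q(t|x).
    q = [[rng.random() + 0.1 for _ in range(n_t)] for _ in range(n_x)]
    q = [[v / sum(row) for v in row] for row in q]

    for _ in range(n_iter):
        # Marginal q(t) = sum_x p(x) q(t|x).
        q_t = [sum(p_x[x] * q[x][t] for x in range(n_x)) for t in range(n_t)]
        # Decoder q(y|t) = sum_x p(x,y) q(t|x) / q(t).
        q_y_given_t = [
            [sum(p_xy[x][y] * q[x][t] for x in range(n_x)) / (q_t[t] + eps)
             for y in range(n_y)]
            for t in range(n_t)
        ]
        new_q = []
        for x in range(n_x):
            # Log-weights: log q(t) - beta * KL(p(y|x) || q(y|t)).
            logits = []
            for t in range(n_t):
                kl = sum(
                    p * math.log(p / (q_y_given_t[t][y] + eps))
                    for y, p in enumerate(p_y_given_x[x])
                    if p > 0
                )
                logits.append(math.log(q_t[t] + eps) - beta * kl)
            top = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(v - top) for v in logits]
            total = sum(weights)
            new_q.append([w / total for w in weights])
        q = new_q
    return q


def ib_point(p_xy, q):
    """Return (I(X;T), I(T;Y)) in nats for an encoder q(t|x)."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(q[0])
    p_x = [sum(row) for row in p_xy]
    joint_xt = [[p_x[x] * q[x][t] for t in range(n_t)] for x in range(n_x)]
    joint_ty = [
        [sum(q[x][t] * p_xy[x][y] for x in range(n_x)) for y in range(n_y)]
        for t in range(n_t)
    ]
    return mutual_information(joint_xt), mutual_information(joint_ty)


# Hypothetical toy joint distribution over 3 values of X and 2 values of Y.
p_xy = [[0.30, 0.05], [0.10, 0.15], [0.05, 0.35]]
encoder = ib_self_consistent(p_xy, n_t=2, beta=5.0)
i_xt, i_ty = ib_point(p_xy, encoder)
print(f"I(X;T) = {i_xt:.4f} nats, I(T;Y) = {i_ty:.4f} nats")
```

Sweeping `beta` and recording the resulting pairs traces out an approximation of the trade-off curve. An iteration of this fixed-multiplier kind controls the target point on the curve only indirectly, which is one reason a method that works with the constrained problem directly can be attractive.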

Paper Structure (16 sections, 6 theorems, 34 equations, 2 figures, 1 table, 2 algorithms)

Introduction
Problem Formulation
Semi-Relaxed IB Model
The Alternating Bregman Projection Algorithm
Algorithm Derivation and Implementation
Updating $w$ via its Dual Variables
Updating $\boldsymbol{r}$ via its Dual Variable
Updating $\boldsymbol{z}$ via its Dual Variables
Convergence Analysis
Numerical Results
Accuracy and Efficiency on Classical Distributions
Convergence Behavior and Algorithm Verification
Experiments on Iris Dataset
Conclusion
Formulation and Algorithm for IR Case
...and 1 more sections

Key Result

Theorem 1

The optimal solution to the semi-relaxed IB model (SR_RI) is exactly the optimal solution to the original IB model (original_RI).

Figures (2)

Figure 1: Convergence trajectories of the residual error for the proposed ABP algorithm. Bernoulli (left), Gaussian (right).
Figure 2: Comparison of the ABP, GAS, and BA algorithms on a real-world dataset in a classification task.

Theorems & Definitions (12)

Theorem 1
proof
Lemma 1
proof
Lemma 2
proof
Theorem 2
proof
Theorem 3
proof
...and 2 more
