Efficient and Provably Convergent Computation of Information Bottleneck: A Semi-Relaxed Approach

Lingyi Chen, Shitong Wu, Jiachuan Ye, Huihui Wu, Wenyi Zhang, Hao Wu

TL;DR

The paper addresses the challenge of computing the relevance-information (RI) function in Information Bottleneck with precision and convergence guarantees. It introduces a semi-relaxed IB model that relaxes the Markov chain and marginal constraints and solves it with an Alternating Bregman Projection (ABP) algorithm that yields closed-form updates and a provable convergence guarantee via descent analysis and Pinsker's inequality. The authors prove equivalence between the semi-relaxed and original RI problems, derive efficient update rules for w, r, and z, and demonstrate convergence to a KKT point. Empirically, ABP outperforms existing methods (BA, GAS, ADMM) in speed while maintaining accuracy, across classical distributions and real data like the Iris dataset, including regimes with phase transitions. The work offers a scalable, reliable approach for IB computations with potential impact on information-theoretic analysis and deep learning applications that rely on accurate IB computations.
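For orientation, the BA baseline named above can be sketched as the classical self-consistent IB iteration. This is a minimal sketch of that textbook baseline, not the paper's ABP algorithm, whose update rules for w, r, and z are not reproduced here. The function names, the trade-off parameter `beta`, the cluster count `n_t`, and the assumption that every row of `p_xy` has positive mass are all illustrative choices rather than details taken from the paper.

```python
import numpy as np


def mutual_information(p_joint):
    """Mutual information (in nats) of a 2-D joint probability table."""
    p_a = p_joint.sum(axis=1, keepdims=True)
    p_b = p_joint.sum(axis=0, keepdims=True)
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (p_a @ p_b)[mask])))


def ib_blahut_arimoto(p_xy, n_t, beta, n_iter=200, seed=0):
    """Classical BA-type IB iteration (illustrative baseline, not ABP).

    p_xy : joint table p(x, y) with every p(x) > 0 (assumption)
    n_t  : number of values of the bottleneck variable T
    beta : trade-off between compression I(X;T) and relevance I(T;Y)
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12
    p_x = p_xy.sum(axis=1)                  # p(x), shape (|X|,)
    p_y_given_x = p_xy / p_x[:, None]       # p(y|x), shape (|X|, |Y|)

    # Random initial encoder p(t|x); each row sums to one.
    enc = rng.random((len(p_x), n_t))
    enc /= enc.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        p_t = p_x @ enc                     # p(t)
        p_ty = enc.T @ p_xy                 # p(t, y) under the chain T - X - Y
        p_y_given_t = p_ty / (p_t[:, None] + eps)

        # KL(p(y|x) || p(y|t)) for every pair (x, t).
        log_ratio = (np.log(p_y_given_x[:, None, :] + eps)
                     - np.log(p_y_given_t[None, :, :] + eps))
        kl = np.sum(p_y_given_x[:, None, :] * log_ratio, axis=2)

        # Encoder update p(t|x) proportional to p(t) * exp(-beta * KL),
        # computed in log space for numerical stability.
        logits = np.log(p_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        enc = np.exp(logits)
        enc /= enc.sum(axis=1, keepdims=True)

    p_xt = p_x[:, None] * enc               # p(x, t)
    p_ty = enc.T @ p_xy                     # p(t, y)
    return enc, mutual_information(p_xt), mutual_information(p_ty)
```

Sweeping `beta` and recording the returned pair (I(X;T), I(T;Y)) traces an approximation of the trade-off curve that the compared methods aim to compute.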

Abstract

Information Bottleneck (IB) is a technique to extract information about one target random variable through another relevant random variable. This technique has garnered significant interest due to its broad applications in information theory and deep learning. Hence, there is a strong motivation to develop efficient numerical methods with high precision and theoretical convergence guarantees. In this paper, we propose a semi-relaxed IB model, where the Markov chain and transition probability condition are relaxed from the relevance-compression function. Based on the proposed model, we develop an algorithm, which recovers the relaxed constraints and involves only closed-form iterations. Specifically, the algorithm is obtained by analyzing the Lagrangian of the relaxed model with alternating minimization in each direction. The convergence property of the proposed algorithm is theoretically guaranteed through descent estimation and Pinsker's inequality. Numerical experiments across classical and discrete distributions corroborate the analysis. Moreover, our proposed algorithm demonstrates notable advantages in terms of computational efficiency, evidenced by significantly reduced run times compared to existing methods with comparable accuracy.
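Pinsker's inequality, invoked in the convergence argument, states that for discrete distributions P and Q the KL divergence in nats satisfies D(P||Q) >= (1/2) * ||P - Q||_1^2. A plausible reading is that it converts a decrease measured in KL terms into control over the L1 distance between iterates; how the paper applies it exactly is not shown in this summary. The snippet below is only a numerical sanity check of the inequality on random distributions; the helper names and the sample sizes are illustrative.

```python
import numpy as np


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats; assumes q > 0 wherever p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def pinsker_gap(p, q):
    """D(p || q) - 0.5 * ||p - q||_1^2, non-negative by Pinsker's inequality."""
    return kl_divergence(p, q) - 0.5 * float(np.sum(np.abs(p - q))) ** 2


rng = np.random.default_rng(0)
gaps = []
for _ in range(1000):
    # Random points on the 5-dimensional probability simplex.
    p = rng.dirichlet(np.ones(5))
    q = rng.dirichlet(np.ones(5))
    gaps.append(pinsker_gap(p, q))

min_gap = min(gaps)  # should never be negative, up to rounding
```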

Paper Structure (16 sections, 6 theorems, 34 equations, 2 figures, 1 table, 2 algorithms)

Key Result

Theorem 1

The optimal solution of the semi-relaxed IB model (SR_RI) is exactly the optimal solution of the original IB model (original_RI).

Figures (2)

  • Figure 1: Convergence trajectories of the residual error for the proposed ABP algorithm. Bernoulli (left), Gaussian (right).
  • Figure 2: Comparison of the ABP, GAS, and BA algorithms on a real-world dataset in a classification task.

Theorems & Definitions (12)

  • Theorem 1
  • proof
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof
  • ...and 2 more