Efficient Solvers for Wyner Common Information with Application to Multi-Modal Clustering

Teng-Hui Huang, Hesham El Gamal

TL;DR

By partitioning the information sources into two parts (a bipartite split), the proposed Bipartite common information framework exposes a difference-of-convex structure that allows efficient non-convex optimization, and it applies to multi-modal clustering without relying on ad-hoc clustering algorithms.

Abstract

We propose two novel extensions of the Wyner common information optimization problem. Each relaxes one of the fundamental constraints in Wyner's formulation. The Variational Wyner Common Information relaxes the constraint of matching the known distribution while imposing conditional independence on the feasible solution set. We derive a tight surrogate upper bound on the resulting unconstrained Lagrangian via the theory of variational inference, and this bound can be minimized efficiently. This solver caters to problems where conditional independence holds and has significantly reduced computational complexity. The Bipartite Wyner Common Information, on the other hand, relaxes the conditional independence constraint while enforcing the matching condition on the feasible set. By leveraging the difference-of-convex structure of the formulated optimization problem, we show that this solver is resilient to conditionally dependent sources. Both solvers provably converge to local stationary points, and empirically they obtain more accurate solutions to Wyner's formulation with substantially less runtime. Moreover, they can be extended to unknown-distribution settings by parameterizing the common randomness as a member of the exponential family of distributions. Our approaches apply to multi-modal clustering problems, where multiple modalities of observations come from the same cluster. Empirically, our solvers outperform state-of-the-art multi-modal clustering algorithms.
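
For orientation, the two constraints mentioned in the abstract can be read off the standard two-source statement of Wyner's problem. The display below is the textbook form, not a formula quoted from the paper; the paper's own notation, its extension to $V$ sources, and the way each relaxed constraint is weighted in the Lagrangian are not shown on this page and may differ.

$$
C(X_1;X_2) \;=\; \min_{P_{Z|X_1,X_2}\,:\; I(X_1;X_2|Z)=0} \; I(X_1,X_2;Z)
\;=\; \min_{\substack{P_Z,\,P_{X_1|Z},\,P_{X_2|Z}\,:\\ \sum_z P_Z(z)P_{X_1|Z}(x_1|z)P_{X_2|Z}(x_2|z) \,=\, P_{X_1,X_2}(x_1,x_2)}} \; I(X_1,X_2;Z)
$$

In the first form the known distribution $P_{X_1,X_2}$ is matched by construction and conditional independence is the explicit constraint, which is the constraint the Bipartite variant is described as relaxing. In the second form conditional independence holds by construction and matching $P_{X_1,X_2}$ is the explicit constraint, which is the constraint the Variational variant is described as relaxing.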

Paper Structure

This paper contains 39 sections, 5 theorems, 1 equation, 6 figures, 4 tables, 2 algorithms.

Key Result

Theorem 1

The sequence $\{P^k_{z|x^V}\}_{k\in\mathbb{N}}$ obtained from the Bipartite solver (eq:wynerdca) converges to a local stationary point $w^*$.
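
The kind of guarantee in Theorem 1 is typical of difference-of-convex (DC) algorithms: the objective is written as $g - h$ with both $g$ and $h$ convex, $h$ is linearized at the current iterate, and the resulting convex subproblem is solved exactly. The sketch below illustrates that generic scheme on a scalar toy objective, $f(w) = w^4 - 2w^2$ with $g(w) = w^4$ and $h(w) = 2w^2$. It is not the Bipartite solver; the function name `dca_toy` and the toy objective are assumptions made purely for illustration.

```python
import numpy as np


def dca_toy(w0, num_iters=50, tol=1e-10):
    """Generic DC iteration on the toy objective f(w) = w**4 - 2*w**2.

    Split: g(w) = w**4 (convex), h(w) = 2*w**2 (convex), f = g - h.
    Each step linearizes h at the current iterate and minimizes
    g(w) - h'(w_k) * w, which has the closed form w = cbrt(h'(w_k) / 4).
    """
    def f(w):
        return w**4 - 2.0 * w**2

    w = float(w0)
    history = [f(w)]  # objective values, one per iterate
    for _ in range(num_iters):
        grad_h = 4.0 * w                       # h'(w_k)
        w_next = float(np.cbrt(grad_h / 4.0))  # argmin_w  w**4 - grad_h * w
        history.append(f(w_next))
        converged = abs(w_next - w) < tol
        w = w_next
        if converged:
            break
    return w, history


w_star, objective_values = dca_toy(0.5)
```

Starting from $w_0 = 0.5$, the iterates approach $w = 1$, which is a stationary point of $f$ (where $f'(w) = 4w^3 - 4w = 0$), and the objective values never increase from one iterate to the next. Which stationary point is reached depends on the starting point, which is why guarantees of this type are stated for local stationary points only.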

Figures (6)

  • Figure 1: Block diagrams of the proposed Bipartite and Variational solvers in unknown distribution settings with $V=3$. The solid lines represent the feed-forward flow whereas the dotted lines correspond to flows for correlation optimization modules. The computed sub-objective functions are written in blue.
  • Figure 2: Evaluation results on the information plane. The minimum mutual information $I(X_1,X_2;Z)$ versus the obtained conditional mutual information $I(X_1;X_2|Z)$.
  • Figure 3: Evaluation results on the information plane. The minimum mutual information $I(X_1,X_2,X_3;Z)$ versus the sum of obtained conditional mutual information $\sum_{i\in\{1,2,3\}}I(X_i;X_{[3]\backslash i}|Z)$.
  • Figure 4: Clustering accuracy versus conditional mutual information $I(X_1;X_2|Z)$ of the obtained $P_\theta(Z|X_1,X_2)$ (minimum $I(X_1,X_2;Z)$ achieved) from compared solvers.
  • Figure 5: Clustering accuracy versus conditional mutual information $\sum_{i\in\{1,2,3\}}I(X_i;X_{[3]\backslash i}|Z)$ of the obtained $P_\theta(Z|X_1,X_2,X_3)$ (minimum $I(X_1,X_2,X_3;Z)$ achieved) from compared solvers.
  • ...and 1 more figure
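
The two axes of the information-plane figures, $I(X_1,X_2;Z)$ and $I(X_1;X_2|Z)$, can be computed directly when all variables are discrete and the distributions are known. The following is a minimal sketch for two sources; the helper names `mutual_information` and `information_plane_point` and the toy distributions are assumptions for illustration and are not taken from the paper's code or experiments.

```python
import numpy as np


def mutual_information(p_ab):
    """I(A;B) in nats for a 2-D joint pmf p_ab[a, b]."""
    p_ab = np.asarray(p_ab, dtype=float)
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0  # zero-probability cells contribute nothing
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a * p_b)[mask])))


def information_plane_point(p_x1x2, p_z_given_x):
    """Return (I(X1,X2;Z), I(X1;X2|Z)) in nats.

    p_x1x2:      array of shape (n1, n2), joint pmf of the two sources
    p_z_given_x: array of shape (n1, n2, nz), encoder P(z | x1, x2)
    """
    p_x1x2 = np.asarray(p_x1x2, dtype=float)
    p_z_given_x = np.asarray(p_z_given_x, dtype=float)
    joint = p_x1x2[:, :, None] * p_z_given_x  # P(x1, x2, z)
    n1, n2, nz = joint.shape

    # Treat the pair (x1, x2) as a single variable to get I(X1,X2;Z).
    i_xz = mutual_information(joint.reshape(n1 * n2, nz))

    # I(X1;X2|Z) = sum_z P(z) * I(X1;X2 | Z=z).
    p_z = joint.sum(axis=(0, 1))
    i_cond = 0.0
    for z in range(nz):
        if p_z[z] > 0:
            i_cond += p_z[z] * mutual_information(joint[:, :, z] / p_z[z])
    return i_xz, i_cond


# Toy sources: X1 = X2 = one fair bit.
p_sources = np.array([[0.5, 0.0], [0.0, 0.5]])

# Encoder A: Z copies X1, so Z explains all dependence between the sources.
copy_encoder = np.zeros((2, 2, 2))
copy_encoder[0, :, 0] = 1.0
copy_encoder[1, :, 1] = 1.0

# Encoder B: Z is constant, so Z carries no information about the sources.
constant_encoder = np.ones((2, 2, 1))

point_copy = information_plane_point(p_sources, copy_encoder)
point_constant = information_plane_point(p_sources, constant_encoder)
```

For these toy sources the copy encoder lands at $(\ln 2,\ 0)$ and the constant encoder at $(0,\ \ln 2)$, the two extremes of the trade-off between compressing $Z$ and removing the residual dependence between the sources.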

Theorems & Definitions (5)

  • Theorem 1
  • Lemma 2 (Lemma 1 of huang2023efficient)
  • Theorem 3
  • Theorem 4
  • Lemma 5