Efficient Sample-optimal Learning of Gaussian Tree Models via Sample-optimal Testing of Gaussian Mutual Information

Sutanu Gayen, Sanket Kale, Sayantan Sen

TL;DR

This work designs a conditional mutual information tester for Gaussian random variables that can test whether two Gaussian random variables are independent or their conditional mutual information is at least $\varepsilon$, and gives an efficient $\varepsilon$-approximate structure-learning algorithm for an $n$-variate Gaussian tree model whose sample complexity is shown to be near-optimal.

Abstract

Learning high-dimensional distributions is a significant challenge in machine learning and statistics. Classical research has mostly concentrated on asymptotic analysis of such data under suitable assumptions. While existing works [Bhattacharyya et al.: SICOMP 2023, Daskalakis et al.: STOC 2021, Choo et al.: ALT 2024] focus on discrete distributions, the current work addresses the tree structure learning problem for Gaussian distributions, providing efficient algorithms with solid theoretical guarantees. This is crucial as real-world distributions are often continuous and differ from the discrete scenarios studied in prior works. In this work, we design a conditional mutual information tester for Gaussian random variables that can test whether two Gaussian random variables are independent, or their conditional mutual information is at least $\varepsilon$, for some parameter $\varepsilon \in (0,1)$ using $\mathcal{O}(\varepsilon^{-1})$ samples which we show to be near-optimal. In contrast, an additive estimation would require $\Omega(\varepsilon^{-2})$ samples. Our upper bound technique uses linear regression on a pair of suitably transformed random variables. Importantly, we show that the chain rule of conditional mutual information continues to hold for the estimated (conditional) mutual information. As an application of such a mutual information tester, we give an efficient $\varepsilon$-approximate structure-learning algorithm for an $n$-variate Gaussian tree model that takes $\widetilde{\Theta}(n\varepsilon^{-1})$ samples which we again show to be near-optimal. In contrast, when the underlying Gaussian model is not known to be tree-structured, we show that $\widetilde{\Theta}(n^2\varepsilon^{-2})$ samples are necessary and sufficient to output an $\varepsilon$-approximate tree structure. We perform extensive experiments that corroborate our theoretical convergence bounds.
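
As a rough illustration of the quantities the abstract refers to, the sketch below computes plug-in estimates of Gaussian (conditional) mutual information and feeds them to a classical Chow-Liu maximum-weight spanning tree. It is not the paper's tester or its algorithm. It relies only on the standard identity $I(X;Y) = -\tfrac{1}{2}\ln(1-\rho^2)$ for jointly Gaussian $X, Y$ with correlation $\rho$, and on the fact that for jointly Gaussian variables the conditional mutual information can be obtained from the correlation of the residuals after regressing on the conditioning variable. The function names (`gaussian_mi`, `gaussian_cmi`, `chow_liu_tree`), the chain model, and the sample size are assumptions made for this example.

```python
import numpy as np

def gaussian_mi(x, y):
    """Plug-in estimate of I(X;Y) = -0.5 * ln(1 - rho^2) for jointly Gaussian X, Y."""
    rho = np.corrcoef(x, y)[0, 1]
    rho2 = min(rho * rho, 1.0 - 1e-12)  # guard against log(0) when |rho| is 1
    return -0.5 * np.log1p(-rho2)

def gaussian_cmi(x, y, z):
    """Plug-in estimate of I(X;Y|Z): regress X and Y on Z (with intercept),
    then take the Gaussian MI of the two residual vectors."""
    design = np.column_stack([np.ones(len(z)), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return gaussian_mi(rx, ry)

def chow_liu_tree(samples):
    """Classical Chow-Liu: maximum-weight spanning tree (Prim's algorithm)
    over pairwise plug-in MI estimates. `samples` has shape (m, n)."""
    n = samples.shape[1]
    corr = np.corrcoef(samples, rowvar=False)
    rho2 = np.clip(corr ** 2, 0.0, 1.0 - 1e-12)
    weights = -0.5 * np.log1p(-rho2)
    np.fill_diagonal(weights, -np.inf)  # never pick a self-loop
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        for u in in_tree:
            for v in range(n):
                if v not in in_tree and (best is None or weights[u, v] > best[0]):
                    best = (weights[u, v], u, v)
        _, u, v = best
        edges.append((min(u, v), max(u, v)))
        in_tree.append(v)
    return sorted(edges)

# Hypothetical test data: a Gaussian chain X0 - X1 - X2 - X3 with
# unit-variance nodes and correlation 0.8 between neighbours.
rng = np.random.default_rng(0)
m = 5000
cols = [rng.standard_normal(m)]
for _ in range(3):
    cols.append(0.8 * cols[-1] + 0.6 * rng.standard_normal(m))
samples = np.column_stack(cols)

edges = chow_liu_tree(samples)
print("recovered edges:", edges)
print("I(X0;X1)    ~", gaussian_mi(samples[:, 0], samples[:, 1]))
print("I(X0;X2|X1) ~", gaussian_cmi(samples[:, 0], samples[:, 2], samples[:, 1]))
```

In the chain, $X_0$ and $X_2$ are conditionally independent given $X_1$, so the last estimate should be close to zero while the first is clearly positive. That gap between "zero" and "at least $\varepsilon$" is the one a tester has to detect.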

Paper Structure

This paper contains 50 sections, 32 theorems, 127 equations, 16 figures, 2 algorithms.

Key Result

Lemma 1.0

Let $X$ and $Y$ be two real-valued Gaussian random variables, and $\varepsilon, \delta \in (0,1)$. If we take $\mathcal{O}(1/\varepsilon^2 + \log 1/\delta)$ samples, with probability at least $1-\delta$, $\left| \widehat{I}(X;Y)-I(X;Y) \right| \leq \varepsilon$ holds, where $\widehat{I}(X;Y)$ is the

Figures (16)

  • Figure 1: Proximity parameter vs No. of samples for the realizable case
  • Figure 2: $\varepsilon$ vs $m^*$ linear regression for realizable case
  • Figure 3: Proximity parameter vs No. of samples for the non-realizable case
  • Figure 4: $\varepsilon$ vs $m^*$ linear regression for non-realizable case
  • Figure 5: Chow-Liu vs GLASSO for $\varepsilon = 0.1$
  • ...and 11 more figures

Theorems & Definitions (65)

  • Lemma 1.0
  • Lemma 1.0
  • Lemma 1.0
  • Lemma 1.0
  • Theorem 1.1
  • Theorem 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Theorem 1.5
  • Definition 2.1: Positive semi-definite matrix
  • ...and 55 more