Learning Gaussian DAG Models without Condition Number Bounds

Constantinos Daskalakis, Vardis Kandiros, Rui Yao

TL;DR

This work shows that the topology of a linear Gaussian DAG with equal-variance noise can be learned with a number of samples that does not depend on the condition number of the covariance matrix. It introduces a graph-specific quantity $\tau(G)$ that governs sample complexity: $m = O\bigl(\max(b_{\min}^{-4}, \tau b_{\min}^{-2})\, d\log(n/d)\bigr)$ samples suffice for information-theoretic recovery, and a nearly matching lower bound holds up to a factor of $d$. Under an additional bounded-variance assumption, a polynomial-time LASSO-based algorithm recovers the graph with $m = O\bigl(R^2 \tau^3 d^4 b_{\min}^{-4} \log n\bigr)$ samples, which gives a practical route when variances are bounded. Together, these results separate directed and undirected Gaussian learning in terms of the governing complexity terms and demonstrate the practical impact of a condition-number-free approach on high-dimensional causal structure learning.
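To get a feel for how the two bounds quoted above scale, the following minimal sketch evaluates both expressions with all absolute constants dropped. The function names are hypothetical, and the numeric values of $\tau$ and $R$ are placeholders chosen for illustration, not values taken from the paper; $R$ is used only as the symbol appearing in the stated bound.

```python
import math


def info_theoretic_samples(n, d, b_min, tau):
    """max(b_min^-4, tau * b_min^-2) * d * log(n / d), constants dropped."""
    return max(b_min ** -4, tau * b_min ** -2) * d * math.log(n / d)


def lasso_samples(n, d, b_min, tau, R):
    """R^2 * tau^3 * d^4 * b_min^-4 * log(n), constants dropped."""
    return R ** 2 * tau ** 3 * d ** 4 * b_min ** -4 * math.log(n)


# Placeholder inputs (tau and R are illustrative assumptions).
n, d, b_min, tau, R = 100, 4, 0.5, 2.0, 1.0
print("information-theoretic:", info_theoretic_samples(n, d, b_min, tau))
print("polynomial-time (LASSO):", lasso_samples(n, d, b_min, tau, R))
```

Because constants are omitted, only the relative growth in $d$, $\tau$ and $b_{\min}$ is meaningful here: the extra $d^3 \tau^2$-type factors in the second expression reflect the price of computational efficiency described above.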

Abstract

We study the problem of learning the topology of a directed Gaussian Graphical Model under the equal-variance assumption, where the graph has $n$ nodes and maximum in-degree $d$. Prior work has established that $O(d \log n)$ samples are sufficient for this task. However, an important factor that is often overlooked in these analyses is the dependence on the condition number of the covariance matrix of the model. Indeed, all algorithms from prior work require a number of samples that grows polynomially with this condition number. In many cases this is unsatisfactory, since the condition number could grow polynomially with $n$, rendering these prior approaches impractical in high-dimensional settings. In this work, we provide an algorithm that recovers the underlying graph and prove that the number of samples required is independent of the condition number. Furthermore, we establish lower bounds that nearly match the upper bound up to a $d$-factor, thus providing an almost tight characterization of the true sample complexity of the problem. Moreover, under a further assumption that all the variances of the variables are bounded, we design a polynomial-time algorithm that recovers the underlying graph, at the cost of an additional polynomial dependence of the sample complexity on $d$. We complement our theoretical findings with simulations on synthetic datasets that confirm our predictions.
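The setting in the abstract can be made concrete with a small simulation. The sketch below assumes the standard linear structural equation form $X_j = \sum_{i} B_{ij} X_i + \varepsilon_j$ with $\varepsilon_j \sim N(0, \sigma^2)$ for every node (equal variance); the exact model equation is not reproduced in this summary, so this form is an assumption. The helper names, the random-sign choice for edge weights, and the fixed topological order $0, 1, \dots, n-1$ are also illustrative assumptions.

```python
import numpy as np


def random_dag_weights(n, d, b_min, b_max, rng):
    """Strictly upper-triangular B; B[i, j] != 0 encodes the edge i -> j."""
    B = np.zeros((n, n))
    for j in range(1, n):
        k = min(d, j)  # node j gets at most d parents among nodes 0..j-1
        parents = rng.choice(j, size=k, replace=False)
        signs = rng.choice([-1.0, 1.0], size=k)
        B[parents, j] = signs * rng.uniform(b_min, b_max, size=k)
    return B


def covariance(B, sigma2=1.0):
    """Covariance of X solving X = B^T X + eps with eps ~ N(0, sigma2 * I)."""
    n = B.shape[0]
    A = np.linalg.inv(np.eye(n) - B.T)  # X = A @ eps
    return sigma2 * A @ A.T


def sample(B, m, rng, sigma2=1.0):
    """Draw m i.i.d. samples (one per row) from the model defined by B."""
    n = B.shape[0]
    eps = rng.normal(scale=np.sqrt(sigma2), size=(m, n))
    # Row form of x = (I - B^T)^{-1} eps
    return eps @ np.linalg.inv(np.eye(n) - B)


rng = np.random.default_rng(0)
for n in (8, 16, 24):
    B = random_dag_weights(n, d=4, b_min=0.5, b_max=1.0, rng=rng)
    Sigma = covariance(B)
    print(n, np.linalg.cond(Sigma), Sigma.diagonal().max())
```

Printing the condition number and the largest variance for growing $n$ is one way to observe the effect the abstract describes, namely that the condition number of the covariance matrix can grow quickly with the number of nodes even when the in-degree and edge weights stay fixed.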

Paper Structure

This paper contains 41 sections, 20 theorems, 86 equations, 8 figures, 4 algorithms.

Key Result

Theorem 2.1

Suppose we run Algorithm [alg: inefficient] using $m$ independent samples generated from a DAG $G$ according to the model [eqn: model]. If Assumption [ass:beta_lower_bound] holds, then there exists an absolute constant $C>0$ such that the following guarantees hold.

Figures (8)

  • Figure 1: Growth of the condition number, $\tau$, and the maximum variance as $n$ grows, averaged over $1000$ random graphs with the same $d=4$, $b_{\min}=0.5$, $b_{\max}=1$.
  • Figure 2: Schematic graph for the ensemble. Every model $G_i$ with $i \geq 1$ is obtained from $G_0$ by reversing one edge in the matching of model $G_0$; e.g., $G_1$ is obtained by reversing $Y_1\to Z_1$ in $G_0$.
  • Figure 3: Schematic graph for the ensemble. Every model $G_i$ with $i \geq 1$ is obtained from $G_0$ by switching one triangle to a chain; e.g., $G_1$ is obtained by changing the triangle $X_1Y_1Z_1$ to the chain $X_1\to Y_1\to Z_1$.
  • Figure 4: Comparison with the PC algorithm
  • Figure 5: Figure for $b_{\max}=1$
  • ...and 3 more figures

Theorems & Definitions (27)

  • Theorem 2.1
  • Lemma 2.2
  • Theorem 2.3
  • Lemma 2.5
  • Theorem 2.6
  • Lemma 3.1
  • Proposition A.1
  • proof
  • Lemma A.2
  • Lemma A.3
  • ...and 17 more