Causal Discovery under Latent Class Confounding

Bijan Mazaheri, Spencer Gordon, Yuval Rabani, Leonard Schulman

TL;DR

It is demonstrated that globally confounded causal structures can still be identifiable with arbitrary structural equations and noise functions, so long as the number of latent classes remains small relative to the size and sparsity of the underlying DAG.

Abstract

An acyclic causal structure can be described with a directed acyclic graph (DAG), where arrows indicate the possibility of direct causation. The task of learning this structure from data is known as "causal discovery." Diverse populations or changing environments can sometimes give rise to data that is heterogeneous in the following sense: each population/environment is a "source" which idiosyncratically determines the forms of those direct causal effects. From this perspective, the source is a latent common cause for every observed variable. While some methods for causal discovery are able to work around latent confounding in special cases, especially when only a few observables are confounded, a global confounder is a difficult challenge. The only known ways to deal with latent global confounding involve assumptions that limit the structural equations and/or noise functions. We demonstrate that globally confounded causal structures can still be identifiable with arbitrary structural equations and noise functions, so long as the number of latent classes remains small relative to the size and sparsity of the underlying DAG.
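To make the notion of a global latent confounder concrete, the following is a minimal simulation sketch. It is not taken from the paper: the Gaussian noise, the class-specific means, and the names `sample_mixture` and `correlation` are all assumptions chosen for illustration. Two observed variables with no edge between them are drawn from a mixture of k sources. They are independent within each source, yet correlated once the source label is hidden.

```python
import math
import random


def sample_mixture(n, k=2, seed=0):
    """Draw n samples (u, x, y) from a k-class mixture (requires k >= 2).

    Hypothetical setup: the latent source U shifts the mean of both X and Y,
    and there is no edge between X and Y, so X and Y are independent given U.
    """
    rng = random.Random(seed)
    # Assumed class-specific means, evenly spaced between -2 and 2.
    means = [-2.0 + 4.0 * u / (k - 1) for u in range(k)]
    samples = []
    for _ in range(n):
        u = rng.randrange(k)           # latent source, unobserved in practice
        x = rng.gauss(means[u], 1.0)   # X depends only on U
        y = rng.gauss(means[u], 1.0)   # Y depends only on U
        samples.append((u, x, y))
    return samples


def correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


data = sample_mixture(20000, k=2, seed=0)
# Pooled data: U is hidden, so X and Y look dependent.
pooled = correlation([(x, y) for _, x, y in data])
# Within a single source: the spurious dependence disappears.
within_class = correlation([(x, y) for u, x, y in data if u == 0])
print(f"pooled correlation:       {pooled:.3f}")
print(f"within-class correlation: {within_class:.3f}")
```

In this toy setting, a naive independence test on the pooled data would suggest an edge between the two variables even though none exists, which is the difficulty the abstract attributes to a global confounder.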

Paper Structure (40 sections, 19 theorems, 20 equations, 6 figures, 3 algorithms)

This paper contains 40 sections, 19 theorems, 20 equations, 6 figures, 3 algorithms.

Key Result

Theorem 1

Consider $\mathcal{G} = (\bm{V}, \bm{E})$ with $\Omega(\log(k) \Delta^3)$ vertices (where $\Omega$ is asymptotic notation), mixture source $U\in \{1, \ldots, k\}$, and degree bound $\Delta$. $\mathcal{G}$ is generically (Lebesgue measure 1 in the space of parameters) identifiable up to its Markov equivalence c

Figures (6)

  • Figure 1: The goal is to learn the graph structure $\mathcal{G}$ without observing $U$.
  • Figure 2: An FP edge after Phase I due to a large set of immoral descendants. The population variable $U$ is omitted to avoid clutter. While $V_i$ and $V_j$ are d-separated by $\mathcal{C} = \emptyset$, no IPA can be made because all of the leftover vertices are immoral descendants.
  • Figure 3: The results of Test 1.
  • Figure 4: Results from Test 2. Correctly returned edges are blue. Red edges are not in the true model. Opacity shows how frequently the edge is in our returned model (we want faint red lines and strong blue lines). These frequencies are given in a table using the same scheme.
  • Figure 5: The results of Test 3. Blue gives the percentage of edges that are correctly identified and orange gives the percentage of missing edges that are correctly removed.
  • ...and 1 more figure

Theorems & Definitions (38)

  • Theorem 1
  • Definition 1
  • Lemma 1
  • proof
  • Lemma 2
  • Lemma 3: Rank Test
  • Definition 2
  • Lemma 4
  • proof
  • Definition 3
  • ...and 28 more