Causal Discovery in Linear Models with Unobserved Variables and Measurement Error

Yuqin Yang, Mohamed Nafea, Negar Kiyavash, Kun Zhang, AmirEmad Ghassami

TL;DR

This work tackles causal discovery in linear systems with both unobserved confounders and measurement error. It defines a canonical linear LV-SEM-ME and introduces separability plus two faithfulness notions, proving identifiability up to AOG or DOG equivalence classes. The authors provide practical recovery algorithms that first infer group structure from the mixing matrix $\boldsymbol{W}^*$ and then enumerate all compatible models within AOG and DOG by center/noise reassignment, under a minimality assumption. The results offer a principled path to recover causal structure in challenging settings and motivate future extensions to nonlinearity and robust mixing-matrix estimation.

Abstract

The presence of unobserved common causes and the presence of measurement error are two of the most limiting challenges in the task of causal structure learning. Ignoring either of the two challenges can lead to detecting spurious causal links among variables of interest. In this paper, we study the problem of causal discovery in systems where these two challenges can be present simultaneously. We consider linear models which include four types of variables: variables that are directly observed, variables that are not directly observed but are measured with error, the corresponding measurements, and variables that are neither observed nor measured. We characterize the extent of identifiability of such models under a separability condition (i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables is identifiable) together with two versions of faithfulness assumptions, and propose a notion of observational equivalence. We provide a graphical characterization of the models that are equivalent and present a recovery algorithm that could return models equivalent to the ground truth.
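To make the four variable types and the role of a mixing matrix concrete, here is a minimal Python sketch. It is not the paper's recovery algorithm. It only illustrates the standard fact that in a linear acyclic model $V = AV + N$, every variable is a linear mixture $V = (I - A)^{-1} N$ of the independent exogenous noises, so the observed variables correspond to a subset of rows of $(I - A)^{-1}$. The graph, the coefficients, the noise distribution, and the helper name `mixing_matrix` are all hypothetical choices made for illustration.

```python
import numpy as np


def mixing_matrix(A, observed):
    """Rows of (I - A)^{-1} that belong to the observed variables.

    A[i, j] is the direct effect of variable j on variable i, so the
    structural equations read V = A V + N, hence V = (I - A)^{-1} N.
    """
    n = A.shape[0]
    total_effects = np.linalg.inv(np.eye(n) - A)
    return total_effects[observed, :]


# Hypothetical toy system (indices are an assumption of this sketch):
#   0 = H : neither observed nor measured
#   1 = Z : not observed directly, but measured with error
#   2 = Y : directly observed
#   3 = U : the measurement of Z (Z plus independent noise)
A = np.zeros((4, 4))
A[1, 0] = 0.8    # H -> Z
A[2, 0] = 0.5    # H -> Y
A[2, 1] = -1.2   # Z -> Y
A[3, 1] = 1.0    # Z -> U

observed = [2, 3]                 # only Y and U are available as data
W = mixing_matrix(A, observed)    # shape: (2 observed, 4 noise terms)

# Simulate data with non-Gaussian exogenous noise (an illustrative choice).
rng = np.random.default_rng(0)
N = rng.laplace(size=(4, 1000))
V = np.linalg.solve(np.eye(4) - A, N)   # all variables, one column per sample
X = V[observed, :]                      # observed data

print(np.round(W, 2))
```

In this toy system the row of `W` for `Y` mixes the noises of `H`, `Z`, and `Y`, while the row for `U` mixes the noises of `H`, `Z`, and `U`. The two rows share columns even though neither observed variable causes the other, which is the kind of pattern that can produce spurious causal links when latent variables or measurement error are ignored.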

Paper Structure (41 sections, 11 theorems, 20 equations, 1 figure, 2 algorithms)

This paper contains 41 sections, 11 theorems, 20 equations, 1 figure, 2 algorithms.

Key Result

Theorem 1

Under the separability assumption and the conventional faithfulness assumption, the linear SEM-ME (resp. LV-SEM) can be identified up to its AOG equivalence class.

Figures (1)

  • Figure 1: Left: Diagram of the model in Example 1. Black circles represent variables that are not observed (in the non-canonical form). Right: Diagram of the canonical model. Double circles represent mleaf variables in $\mathcal{Z}^L$, and blue circles represent unobserved variables in $\mathcal{H}$.

Theorems & Definitions (22)

  • Definition 1: General linear LV-SEM-ME
  • Definition 2: Canonical linear LV-SEM-ME
  • Example 1
  • Example 1: Continued
  • Remark 1
  • Definition 3: Ancestral ordered grouping (AOG)
  • Definition 4: AOG equivalence class
  • Theorem 1
  • Definition 5: Direct ordered grouping (DOG)
  • Definition 6: DOG equivalence class
  • ...and 12 more