MissNODAG: Differentiable Cyclic Causal Graph Learning from Incomplete Data

Muralikrishnna G. Sethuraman, Razieh Nabi, Faramarz Fekri

TL;DR

This work proposes MissNODAG, a differentiable framework for learning both the underlying cyclic causal graph and the missingness mechanism from partially observed data, including data missing not at random.

Abstract

Causal discovery in real-world systems, such as biological networks, is often complicated by feedback loops and incomplete data. Standard algorithms, which assume acyclic structures or fully observed data, struggle with these challenges. To address this gap, we propose MissNODAG, a differentiable framework for learning both the underlying cyclic causal graph and the missingness mechanism from partially observed data, including data missing not at random. Our framework integrates an additive noise model with an expectation-maximization procedure, alternating between imputing missing values and optimizing the observed data likelihood, to uncover both the cyclic structures and the missingness mechanism. We demonstrate the effectiveness of MissNODAG through synthetic experiments and an application to real-world gene perturbation data.
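
The alternation described above, between imputing missing values and optimizing a likelihood, can be illustrated on a much simpler problem. The sketch below is not MissNODAG: it assumes a hypothetical one-parameter linear model `y = b * x + noise` with Gaussian noise, no cycles, and values of `y` missing completely at random, which is a far easier setting than the MNAR case the paper targets. All names (`fit_slope`, `em_like_fit`, `TRUE_SLOPE`) are made up for this illustration.

```python
import random


def fit_slope(xs, ys):
    """Least-squares slope through the origin.

    For y = b * x + Gaussian noise this is the maximum-likelihood estimate of b.
    """
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def em_like_fit(xs, ys, n_iters=50):
    """Alternate between imputing missing y values and refitting the slope.

    Missing entries of ys are marked with None.
    """
    b = 0.0  # arbitrary starting value (assumption of this toy example)
    for _ in range(n_iters):
        # Imputation step: replace each missing y by its expected value
        # under the current slope estimate.
        filled = [b * x if y is None else y for x, y in zip(xs, ys)]
        # Optimization step: refit the slope on the completed data.
        b = fit_slope(xs, filled)
    return b


# Synthetic data: true slope 2.0, small noise, about 30% of y values removed.
TRUE_SLOPE = 2.0
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [TRUE_SLOPE * x + rng.gauss(0.0, 0.1) for x in xs]
ys = [None if rng.random() < 0.3 else y for y in ys]

b_hat = em_like_fit(xs, ys)

# Reference: slope fitted on the fully observed pairs only.
observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
b_observed_only = fit_slope([x for x, _ in observed], [y for _, y in observed])
```

In this toy setting the loop converges to the same slope as a fit on the fully observed pairs, since the missing values carry no extra information about `b`. When missingness depends on the unobserved values themselves (MNAR), that shortcut is no longer valid in general, which is why the abstract describes learning the missingness mechanism jointly with the graph.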

Paper Structure

This paper contains 22 sections, 38 equations, 11 figures, 2 tables, 1 algorithm.

Figures (11)

  • Figure 1: Example $m$-graphs with three variables illustrating: (a) An MNAR mechanism considered in our MissNODAG framework; (b) An MNAR mechanism where $R$s are connected and the full law is identifiable.
  • Figure 2: Comparison of results for learning causal graph structure (target law) under linear (left) and nonlinear (right) SEMs with an MNAR missingness mechanism ($K=10$). Each missingness indicator has at most 3 parents.
  • Figure 3: Comparison of results for learning the missingness mechanism ($K = 10$).
  • Figure 4: Comparison of results for learning causal graph structure (target law) under linear SEMs with MAR missingness mechanism ($K=10$).
  • Figure 5: Results of target law recovery for linear SEM with varying training set sizes. The average missing probability was set to 0.2, and each $R_k$ has a parent set cardinality of 3.
  • ...and 6 more figures