Causal Discovery on Dependent Binary Data

Alex Chen; Qing Zhou

TL;DR

This work tackles causal discovery when the observations are dependent binary data. It introduces a latent utility model in which dependence across units is captured by a covariance matrix $\Sigma$. The authors develop a pairwise maximum likelihood estimator of $\Sigma$ and an EM-like procedure that recovers the latent data and decorrelates it using the Cholesky factor of $\Sigma^{-1}$, so that standard DAG learning can be run on the decorrelated surrogates $Z$. The approach improves structure-learning accuracy over methods that assume independence, as demonstrated on synthetic data and on real scRNA-seq gene regulatory network (GRN) tasks. The method scales to $p>n$ and integrates with existing causal discovery tools.

Abstract

The assumption of independence between observations (units) in a dataset is prevalent across various methodologies for learning causal graphical models. However, this assumption often finds itself in conflict with real-world data, posing challenges to accurate structure learning. We propose a decorrelation-based approach for causal graph learning on dependent binary data, where the local conditional distribution is defined by a latent utility model with dependent errors across units. We develop a pairwise maximum likelihood method to estimate the covariance matrix for the dependence among the units. Then, leveraging the estimated covariance matrix, we develop an EM-like iterative algorithm to generate and decorrelate samples of the latent utility variables, which serve as decorrelated data. Any standard causal discovery method can be applied on the decorrelated data to learn the underlying causal graph. We demonstrate that the proposed decorrelation approach significantly improves the accuracy in causal graph learning, through numerical experiments on both synthetic and real-world datasets.
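The decorrelation step mentioned above can be illustrated with a minimal sketch. It assumes that the unit covariance $\Sigma$ is already known and that continuous latent data are available as an $n \times p$ matrix; in the paper both are estimated (pairwise likelihood for $\Sigma$, an EM-like procedure for the latent data), which this sketch does not attempt. The function name `decorrelate` and the toy AR(1)-style covariance are illustrative assumptions, not the authors' code. If $\Sigma^{-1} = LL^\top$ is a Cholesky factorization, then $L^\top Z$ has identity covariance across units, since $L^\top \Sigma L = I$.

```python
import numpy as np

def decorrelate(Z, Sigma):
    """Remove dependence across units (rows) of Z.

    Z     : (n, p) array, each column has covariance Sigma across the n units
    Sigma : (n, n) positive-definite covariance among units (assumed known here)
    """
    precision = np.linalg.inv(Sigma)
    # Lower-triangular L with precision = L @ L.T
    L = np.linalg.cholesky(precision)
    # Cov(L.T @ z) = L.T @ Sigma @ L = I for every column z of Z
    return L.T @ Z

rng = np.random.default_rng(0)
n, p = 6, 20000

# Toy unit covariance (hypothetical AR(1)-style structure, for illustration only)
idx = np.arange(n)
Sigma = 0.6 ** np.abs(idx[:, None] - idx[None, :])

# Simulate dependent units: each column of Z has covariance Sigma
C = np.linalg.cholesky(Sigma)
Z = C @ rng.standard_normal((n, p))

Z_tilde = decorrelate(Z, Sigma)

# Empirical covariance across units, averaged over the p columns
emp_cov = Z_tilde @ Z_tilde.T / p
print(np.round(emp_cov, 2))
```

In this toy setup the columns are independent of each other, so the example only shows the removal of between-unit dependence; in the setting described in the abstract, a standard structure-learning method would then be run on the decorrelated matrix.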

Paper Structure (22 sections, 1 theorem, 14 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Motivation and contributions
Related works
A DAG model for dependent data
Methods
Covariance estimation
Latent Data Recovery and Decorrelation
Structure Learning
Experimental Results
Application on scRNA-Seq Data
Pre-estimate of Block Structure
Model Evaluation
Discussion
Summary
Limitations and future work
...and 7 more sections

Key Result

Lemma 2.1

Under the latent utility model in Equations eq:cont and eq:discrete, the DAG $\mathcal{G}(X)$ among the observed discrete variables $X$ is identical to the DAG $\mathcal{G}(Z)$ among the latent variables $Z$.

Figures (7)

Figure 1: Structure learning accuracy before and after decorrelation for 8 real Bayesian networks.
Figure 2: F-1 scores before and after decorrelation across 10 simulations for each setting of $(n,p)$ under deviations from our model assumptions.
Figure 3: Distribution of the test-data log-likelihood across the 10 folds in the RNA-seq data using three different methods.
Figure 4: Single simulation of finding the correlation over a pair of units where $n = 500$ and $p = 500$.
Figure 5: RMSE of the estimated covariance $\widehat{\Sigma}$. Each box plot summarizes 10 simulations. Simulations used a mixed covariance structure with block sizes ranging from 10 to 15 and a random DAG.
...and 2 more figures

Theorems & Definitions (1)

Lemma 2.1
