Moment Matters: Mean and Variance Causal Graph Discovery from Heteroscedastic Observational Data

Yoichi Chikahara

TL;DR

The paper proposes a Bayesian, moment-driven causal discovery framework that infers separate mean and variance graphs from observational heteroscedastic data. It also develops a variational inference method that learns a posterior distribution over both graphs, enabling principled uncertainty quantification of structural features.
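To make "uncertainty quantification of structural features" concrete, the sketch below estimates posterior edge and path probabilities by averaging over sampled graphs. It is only an illustration under stated assumptions: the samples are hand-written adjacency matrices rather than draws from the paper's variational posterior, and the helper names `edge_marginals`, `has_path`, and `path_probability` are hypothetical.

```python
import numpy as np

def edge_marginals(graph_samples):
    """Fraction of sampled graphs containing each directed edge i -> j."""
    samples = np.asarray(graph_samples, dtype=float)  # shape (S, d, d)
    return samples.mean(axis=0)

def has_path(adj, src, dst):
    """True if a directed path src -> ... -> dst exists in one adjacency matrix."""
    adj = np.asarray(adj, dtype=bool)
    reach = adj[src].copy()  # nodes reachable from src in one step
    for _ in range(adj.shape[0]):
        # Add every node that is a child of an already reachable node.
        reach = reach | adj[reach].any(axis=0)
    return bool(reach[dst])

def path_probability(graph_samples, src, dst):
    """Fraction of sampled graphs that contain a directed path src -> dst."""
    return float(np.mean([has_path(g, src, dst) for g in graph_samples]))

# Four hypothetical graph samples over three nodes (entry [i, j] = 1 means i -> j).
samples = np.array([
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1, 1 -> 2
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],  # 0 -> 1
    [[0, 0, 1], [0, 0, 0], [0, 0, 0]],  # 0 -> 2
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1, 1 -> 2
])

marginals = edge_marginals(samples)
p_path_0_to_2 = path_probability(samples, 0, 2)
```

The same averaging can be applied separately to samples of the mean graph and of the variance graph, giving one probability table per moment.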

Abstract

Heteroscedasticity -- where the variance of a variable changes with other variables -- is pervasive in real data, and elucidating why it arises from the perspective of statistical moments is crucial in scientific knowledge discovery and decision-making. However, standard causal discovery does not reveal which causes act on the mean versus the variance, as it returns a single moment-agnostic graph, limiting interpretability and downstream intervention design. We propose a Bayesian, moment-driven causal discovery framework that infers separate mean and variance causal graphs from observational heteroscedastic data. We first derive the identification results by establishing sufficient conditions under which these two graphs are separately identifiable. Building on this theory, we develop a variational inference method that learns a posterior distribution over both graphs, enabling principled uncertainty quantification of structural features (e.g., edges, paths, and subgraphs). To address the challenges of parameter optimization in heteroscedastic models with two graph structures, we take a curvature-aware optimization approach and develop a prior incorporation technique that leverages domain knowledge on node orderings, improving sample efficiency. Experiments on synthetic, semi-synthetic, and real data show that our approach accurately recovers mean and variance structures and outperforms state-of-the-art baselines.
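A small simulation may help illustrate how mean parents and variance parents can differ. The sketch below assumes the common heteroscedastic noise model form $X_j = m_j(\cdot) + \sqrt{v_j(\cdot)}\,\epsilon_j$, where $m_j$ takes the parents in the mean graph and $v_j$ takes the parents in the variance graph. The three-node graph, the functions `tanh` and `0.1 + x^2`, and all names are hypothetical choices for illustration, not the paper's experimental setup.

```python
import numpy as np

def simulate_mean_variance_hnm(n, seed=0):
    """Draw n samples from a toy model with distinct mean and variance parents.

    Assumed (hypothetical) structure:
      mean graph:     X1 -> X3
      variance graph: X2 -> X3
    """
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    eps = rng.normal(size=n)           # Gaussian noise
    mean = np.tanh(2.0 * x1)           # m_3: nonlinear, depends only on X1
    var = 0.1 + x2 ** 2                # v_3: strictly positive, depends only on X2
    x3 = mean + np.sqrt(var) * eps
    return np.column_stack([x1, x2, x3])

X = simulate_mean_variance_hnm(20000)
x1, x2, x3 = X.T

# With the true mean function removed, the squared residual tracks the
# variance parent X2 but carries no information about the mean parent X1.
resid_sq = (x3 - np.tanh(2.0 * x1)) ** 2
corr_var_parent = np.corrcoef(resid_sq, x2 ** 2)[0, 1]
corr_mean_parent = np.corrcoef(resid_sq, x1 ** 2)[0, 1]
```

A moment-agnostic graph would simply list both X1 and X2 as parents of X3; the two correlations show the kind of signal that lets the two roles be told apart in this toy case.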

Paper Structure (47 sections, 7 theorems, 29 equations, 6 figures, 5 tables, 1 algorithm)

Introduction
Background to Structural Causal Models
Towards Moment-Driven Causal Discovery
Mean-Variance HNM with Mean and Variance Causal Graphs
Identifiability Results
Proposed Method
Model Formulations
DAG Distribution Model
Likelihood Model
Model Parameter Learning Overview
Challenges in Complex Heteroscedasticity and Two-Graph Inference
Prior Knowledge Incorporation
Experiments
Simulated Data Experiments
Mean and Variance Graph Inference
...and 32 more sections

Key Result

Theorem 3.5

Under Assumptions asmp-csuff, asmp-cm, asmp-dag, and asmp-order, mean and variance causal graphs $G^{M}$ and $G^{V}$ are identifiable from observational distribution $\mathrm{P}(\mathbf{X})$ if for $j = 1, \dots, d$, (A) $m_{j}$ is a nonlinear function, (B) $v_{j}$ is a pi

Figures (6)

Figure 1: Local protein regulatory network example represented by (a) Moment-agnostic causal graph and (b) mean and variance causal graphs (shown in red and blue edges)
Figure 2: Mean and variance causal graph inference performance on synthetic and SERGIO datasets with sample size $n=500$. Achieving both lower SHD rate (left) and higher F1 score (right) is better.
Figure 3: Mean and variance causal graph inference performance on synthetic datasets ($n=500$) under dense graphs
Figure 4: Moment-agnostic causal graph inference performance on non-Gaussian synthetic datasets ($n=500$) with (a) Laplace noise and (b) Student-t noise. Achieving both lower SHD (left) and lower SID (right) is better.
Figure 5: Moment-agnostic causal graph inference performance on nonlinear Gaussian ANM datasets ($n=500$) with number of nodes $d = 5, 10, 20, 50$: $\downarrow$ and $\uparrow$ denote "lower is better" and "higher is better".
...and 1 more figure
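The captions above report structural Hamming distance (SHD) and F1 score. As a rough sketch of how such metrics are commonly computed from adjacency matrices, the code below counts a reversed edge as a single SHD error and computes F1 over directed edges. The normalization behind the "SHD rate" in Figure 2 is not specified here, so the plain count is shown; the function names and the example graphs are hypothetical.

```python
import numpy as np

def shd(true_adj, est_adj):
    """Structural Hamming distance between two DAG adjacency matrices.

    Each node pair with any mismatch (missing, extra, or reversed edge)
    counts once, so a reversal is a single error.
    """
    t = np.asarray(true_adj, dtype=bool)
    e = np.asarray(est_adj, dtype=bool)
    diff = t != e
    pair_diff = diff | diff.T          # mismatch in either direction of a pair
    return int(np.triu(pair_diff, k=1).sum())

def edge_f1(true_adj, est_adj):
    """F1 score over directed edges."""
    t = np.asarray(true_adj, dtype=bool)
    e = np.asarray(est_adj, dtype=bool)
    tp = int((t & e).sum())
    if tp == 0:
        return 0.0
    precision = tp / int(e.sum())
    recall = tp / int(t.sum())
    return 2 * precision * recall / (precision + recall)

# True graph: 0 -> 1, 1 -> 2. Estimate: 1 -> 0 (reversed), 1 -> 2, 0 -> 2 (extra).
true_graph = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
est_graph = np.array([[0, 0, 1], [1, 0, 1], [0, 0, 0]])

shd_value = shd(true_graph, est_graph)     # one reversal + one extra edge
f1_value = edge_f1(true_graph, est_graph)  # precision 1/3, recall 1/2
```

In a two-graph setting, the same functions would be applied once to the mean graph and once to the variance graph.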

Theorems & Definitions (12)

Example 1.1
Theorem 3.5
Remark 3.6: Role of Gaussianity
Remark 3.7: Non-constancy
Corollary A.1
Remark A.2
Theorem B.1: khemakhem2021causal
Theorem B.2: yin2024effective
Theorem B.3
Theorem B.4
...and 2 more
