
Federated Causal Discovery From Interventions

Amin Abyaneh, Nino Scherrer, Patrick Schwab, Stefan Bauer, Bernhard Schölkopf, Arash Mehrjou

TL;DR

The paper tackles causal discovery under privacy constraints by enabling learning of a global DAG $G$ from distributed data that include interventional samples. It introduces FedCDI, a two-phase framework where each client learns a local belief over edges using a neural LCDM and the server aggregates these beliefs with a novel proximity-based method that accounts for which covariates were intervened. Empirical results on synthetic ER graphs and real-world bnlearn graphs show FedCDI achieving performance on par with centralized approaches and outperforming prior federated methods, especially under interventional data heterogeneity. The work demonstrates scalability to multiple clients, supports both horizontal and vertical data splits, and provides code for reproducibility, highlighting its practical impact for privacy-preserving causal structure learning in distributed environments.

Abstract

Causal discovery serves a pivotal role in mitigating model uncertainty through recovering the underlying causal mechanisms among variables. In many practical domains, such as healthcare, access to the data gathered by individual entities is limited, primarily due to privacy and regulatory constraints. However, the majority of existing causal discovery methods require the data to be available in a centralized location. In response, researchers have introduced federated causal discovery. While previous federated methods consider distributed observational data, the integration of interventional data remains largely unexplored. We propose FedCDI, a federated framework for inferring causal structures from distributed data containing interventional samples. In line with the federated learning framework, FedCDI improves privacy by exchanging belief updates rather than raw samples. Additionally, it introduces a novel intervention-aware method for aggregating individual updates. We analyze scenarios with shared or disjoint intervened covariates, and mitigate the adverse effects of interventional data heterogeneity. The performance and scalability of FedCDI are rigorously tested across a variety of synthetic and real-world graphs.
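To make the idea of exchanging belief updates rather than raw samples concrete, the following is a minimal sketch of server-side aggregation. It is not the paper's implementation: the function name `aggregate_beliefs`, the nested-list representation of edge beliefs, the per-edge weights, and the fallback value of 0.5 for edges with no information are all assumptions chosen for illustration. With uniform weights it reduces to naive averaging; non-uniform weights stand in for an intervention-aware scheme.

```python
def aggregate_beliefs(client_beliefs, client_weights=None):
    """Combine per-client edge-belief matrices into one global belief matrix.

    client_beliefs: list of d x d nested lists; entry [i][j] in [0, 1] is a
        client's belief that the edge X_i -> X_j exists (hypothetical format).
    client_weights: optional list of d x d non-negative weights, one matrix
        per client. If omitted, every client counts equally (naive averaging).
    """
    n_clients = len(client_beliefs)
    d = len(client_beliefs[0])
    if client_weights is None:
        client_weights = [[[1.0] * d for _ in range(d)] for _ in range(n_clients)]

    global_belief = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            if i == j:
                continue  # no self-loops in a DAG
            total = sum(client_weights[k][i][j] for k in range(n_clients))
            if total == 0:
                # Assumption: with no usable information, stay undecided.
                global_belief[i][j] = 0.5
                continue
            weighted = sum(
                client_weights[k][i][j] * client_beliefs[k][i][j]
                for k in range(n_clients)
            )
            global_belief[i][j] = weighted / total
    return global_belief


# Two clients, two variables: only belief matrices leave the clients.
client_a = [[0.0, 0.9], [0.1, 0.0]]
client_b = [[0.0, 0.5], [0.5, 0.0]]
naive = aggregate_beliefs([client_a, client_b])

# Trust client A three times as much as client B on every edge.
weights_a = [[3.0, 3.0], [3.0, 3.0]]
weights_b = [[1.0, 1.0], [1.0, 1.0]]
weighted = aggregate_beliefs([client_a, client_b], [weights_a, weights_b])
```

Only the belief matrices and weights cross the client boundary in this sketch, which mirrors the privacy motivation stated in the abstract.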

Paper Structure

This paper contains 28 sections, 1 theorem, 8 equations, 15 figures, 3 tables, 2 algorithms.

Key Result

Proposition 4.2

Consider $X_i, X_j \in X$ to be two arbitrary variables in $G$. Then, for a client $C_k$ admitting an intervened variable, $X_s \in X^{k}_{\mathcal{I}}$, the reliability of $\psi^k_{ij}$, denoted by $r^k_{ij} \in [0, 1]$, is higher if $X_i \to X_j$ belongs to a path in $G$ descending from $X_s$.
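The intuition behind Proposition 4.2 can be sketched in code under strong simplifying assumptions. The example below assumes a known reference DAG given as an adjacency dictionary (in practice the true graph is unknown), and it uses a hypothetical scoring rule: an edge $X_i \to X_j$ on a path descending from an intervened variable $X_s$ receives a score of $1 / (1 + \text{hops from } X_s \text{ to } X_i)$, while every other candidate edge receives a small floor value. The function names, the decay rule, and the floor are illustrative choices, not the paper's definition of $r^k_{ij}$.

```python
from collections import deque


def distances_from(adj, source):
    """BFS hop distances from source along directed edges.

    Nodes that are not reachable from source are absent from the result.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def edge_reliability(adj, intervened, floor=0.1):
    """Assign each ordered pair (i, j) a reliability score in [0, 1].

    Edges lying on a path that descends from an intervened variable score
    higher, decaying with distance from it (hypothetical rule). All other
    pairs keep the floor value.
    """
    nodes = set(adj) | {v for children in adj.values() for v in children}
    rel = {(i, j): floor for i in nodes for j in nodes if i != j}
    for s in intervened:
        for i, hops in distances_from(adj, s).items():
            for j in adj.get(i, ()):
                if j != i:
                    rel[(i, j)] = max(rel[(i, j)], 1.0 / (1.0 + hops))
    return rel


# Reference DAG: A -> B -> C and D -> C; the client intervenes on A only.
graph = {"A": ["B"], "B": ["C"], "C": [], "D": ["C"]}
reliability = edge_reliability(graph, intervened=["A"])
```

Here `A -> B` and `B -> C` lie on a path descending from the intervened variable `A` and so score above the floor, whereas `D -> C` does not, matching the ordering the proposition describes.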

Figures (15)

  • Figure 1: Overview of FedCDI, including the distributed dataset and local learning process. We only depict a single client, as the rest perform similar operations.
  • Figure 2: FedCDI is compared against a centralized approach and an isolated client for various interventional data sizes. In the centralized approach, the entire dataset is available, while in both FedCDI and the isolated-client setting, each client is limited to only half of $D$, distributed horizontally.
  • Figure 3: We apply the proximity-based and naive aggregation methods to vertically distributed $D_{\mathcal{I}}$. Each curve corresponds to a 5-client setup, where clients have access to disjoint and covering subsets of the vertically distributed interventional dataset, i.e., each local dataset contains interventional samples on only 4 out of 20 dataset features.
  • Figure 4: Effect of additional clients participating in FedCDI for different $D_{\mathcal{I}}$ sizes. The discovered DAG is more accurate when $size(D_{\mathcal{I}})$ increases with the addition of new clients.
  • Figure 5: FedCDI manages to retain the learned structure when a fixed-size $D_{\mathcal{I}}$ is further divided among an increasing number of clients, in contrast to the setting of Figure 4.
  • ...and 10 more figures

Theorems & Definitions (3)

  • Definition 4.1
  • Proposition 4.2
  • Example 4.2.1