Disentangle Estimation of Causal Effects from Cross-Silo Data

Yuxuan Liu, Haozhao Wang, Shuang Wang, Zhiming He, Wenchao Xu, Jialiang Zhu, Fan Yang

TL;DR

The paper tackles cross-silo causal inference with heterogeneous feature spaces under privacy constraints. It introduces FedDCI, a disentangled architecture that uses shared and private branches and a global constraint to transfer causal information while keeping local data local. A KL-based encoder aligns latent representations across silos and a coordinated optimization strategy yields convergence guarantees under standard non-convex assumptions, with detailed analysis showing a sublinear convergence rate. Empirical results on semi-synthetic Twins and IHDP datasets demonstrate that FedDCI outperforms state-of-the-art baselines in non-IID cross-silo settings, highlighting its practical potential for private, multi-domain causal effect estimation.
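
As a rough illustration of the shared/private split described above, the sketch below averages only the shared-branch parameters across silos and leaves each private branch untouched. The sample-size-weighted average (FedAvg-style), the function name `aggregate_shared`, and the toy parameter vectors are assumptions made for this example; the summary does not specify FedDCI's actual aggregation rule.

```python
import numpy as np

def aggregate_shared(shared_params, num_samples):
    """Sample-size-weighted average of per-silo shared-branch parameters.

    Assumption: FedAvg-style weighting; the paper's own rule may differ.
    """
    weights = np.asarray(num_samples, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(shared_params)            # shape: (num_silos, dim)
    return np.tensordot(weights, stacked, axes=1)

# Hypothetical silos: shared branches have a common shape, while private
# branches may differ in size because the feature spaces are heterogeneous.
silos = [
    {"shared": np.array([1.0, 2.0]), "private": np.array([0.5]), "n": 100},
    {"shared": np.array([3.0, 4.0]), "private": np.array([-0.5, 0.1]), "n": 300},
]

global_shared = aggregate_shared(
    [s["shared"] for s in silos], [s["n"] for s in silos]
)

# Broadcast the aggregated shared branch back; private branches stay local.
for s in silos:
    s["shared"] = global_shared.copy()

print(global_shared)  # [2.5 3.5]
```

Only the shared vectors ever leave a silo in this sketch, which mirrors the privacy constraint in the summary: raw data and private-branch parameters are never uploaded.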

Abstract

Estimating causal effects among different events is of great importance to critical fields such as drug development. However, the data features associated with events may be distributed across various silos and remain private to their respective parties, which prevents direct information exchange between them. As a result, local causal effect estimates can be biased, because each one relies on only a subset of the covariates. To tackle this challenge, we introduce a disentangled architecture that transmits model parameters enriched with causal mechanisms across silos through a combination of shared and private branches. We also introduce global constraints to mitigate bias within the various missing domains, thereby improving the accuracy of causal effect estimation. Extensive experiments on new semi-synthetic datasets show that our method outperforms state-of-the-art baselines.

Paper Structure

This paper contains 12 sections, 2 theorems, 18 equations, 2 figures, and 2 tables.

Key Result

Theorem 1

Assuming Assumption 1 holds, and given that $\Vert\nabla L_{\omega_k}(\omega^{s,k}_t)\Vert^2 \leq A^2$, $\Vert\nabla L_{\omega_k}(\omega^{p,k}_t)\Vert^2 \leq B^2$, and $\xi = \sqrt{\frac{2M}{\beta T(A+B)^2}}$, where $L_{\omega_k}(\omega_1) - L_{\omega_k}(\omega_T) \leq M$, we can demonstr… Under these conditions, if both $\omega_t^{s,k}$ and $\omega^{p,k}_t$ are smooth, the process can a…
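
To make the step-size expression in the theorem concrete, the following sketch evaluates $\xi = \sqrt{2M / (\beta T (A+B)^2)}$ for illustrative values. The function name `step_size` and the numbers are hypothetical. Reading $\beta$ as a smoothness constant and $T$ as the number of iterations is an assumption, since the excerpt above does not define either symbol.

```python
import math

def step_size(M, beta, T, A, B):
    """Evaluate xi = sqrt(2M / (beta * T * (A + B)^2)).

    M    : bound on L(omega_1) - L(omega_T)
    A, B : bounds on the shared- and private-branch gradient norms
    beta : assumed to be a smoothness constant (not defined in the excerpt)
    T    : assumed to be the number of iterations
    """
    if M <= 0 or beta <= 0 or T <= 0 or (A + B) <= 0:
        raise ValueError("M, beta, T and A + B must all be positive")
    return math.sqrt(2.0 * M / (beta * T * (A + B) ** 2))

# Illustrative values only, not taken from the paper.
xi_100 = step_size(M=1.0, beta=2.0, T=100, A=1.0, B=1.0)
xi_400 = step_size(M=1.0, beta=2.0, T=400, A=1.0, B=1.0)
print(xi_100, xi_400)  # 0.05 0.025
```

Quadrupling $T$ halves $\xi$, so the step size decays as $1/\sqrt{T}$, which is the usual choice behind sublinear rates in smooth non-convex analyses.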

Figures (2)

  • Figure 1: Illustration of our framework. In the local training phase, shared information is communicated to the private branch. During the aggregation stage, the shared model is uploaded to the server for model aggregation.
  • Figure 2: Effect of $\alpha$ on the PEHE and ATE metrics when only the sample sizes are non-IID versus when both the sample sizes and the sample characteristics are non-IID.
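
The PEHE and ATE metrics named in Figure 2 are commonly computed as below. This is a minimal sketch using the standard definitions (mean squared error of individual treatment effects, and absolute error of the average treatment effect). Whether the paper reports PEHE or its square root is not stated in this summary, so the `sqrt` flag and the toy effect values are assumptions.

```python
import numpy as np

def pehe(tau_true, tau_pred, sqrt=False):
    """Precision in Estimation of Heterogeneous Effect.

    Mean squared error between true and predicted individual effects;
    set sqrt=True for the square-root variant some papers report.
    """
    tau_true = np.asarray(tau_true, dtype=float)
    tau_pred = np.asarray(tau_pred, dtype=float)
    value = float(np.mean((tau_pred - tau_true) ** 2))
    return float(np.sqrt(value)) if sqrt else value

def ate_error(tau_true, tau_pred):
    """Absolute error of the estimated average treatment effect."""
    return float(abs(np.mean(tau_pred) - np.mean(tau_true)))

# Toy individual effects (hypothetical, for illustration only).
tau_true = [1.0, 2.0, 3.0]
tau_pred = [1.5, 2.0, 2.5]

print(pehe(tau_true, tau_pred))       # about 0.1667
print(ate_error(tau_true, tau_pred))  # 0.0
```

The toy example shows why both metrics are reported: individual errors can cancel in the average, giving a zero ATE error while PEHE remains positive.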

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2