Transferring Causal Effects using Proxies

Manuel Iglesias-Alonso, Felix Schur, Julius von Kügelgen, Jonas Peters

TL;DR

This work estimates the causal effect of a treatment $X$ on an outcome $Y$ when the distribution of an unobserved confounder $U$ shifts across domains and only a proxy $W$ of $U$ is observed in the target domain. Using data from multiple source domains and assuming the proxy is sufficiently informative (a rank/invertibility condition on $P(W|E,x)$), the target interventional distribution $q(y|\mathrm{do}(x))$ is identifiable, even when $X$ and $Y$ are continuous. The authors propose two estimators and prove consistency for both; one is also asymptotically normal, which yields valid confidence intervals. The estimators are evaluated in simulations and in a hotel-ranking application. The framework targets causal domain adaptation: estimating the effect of interventions in an unseen domain where only proxy measurements are available.

Abstract

We consider the problem of estimating a causal effect in a multi-domain setting. The causal effect of interest is confounded by an unobserved confounder and can change between the different domains. We assume that we have access to a proxy of the hidden confounder and that all variables are discrete or categorical. We propose methodology to estimate the causal effect in the target domain, where we assume to observe only the proxy variable. Under these conditions, we prove identifiability (even when treatment and response variables are continuous). We introduce two estimation techniques, prove consistency, and derive confidence intervals. The theoretical results are supported by simulation studies and a real-world example studying the causal effect of website rankings on consumer choices.

Paper Structure

This paper contains 52 sections, 12 theorems, 90 equations, 14 figures, 1 table.

Key Result

Theorem 1

Under the data generating process described in ch:ProblemDescription (ignoring the paragraph "Generalization") and assuming as:RankAssumption, we have for all ${x \in \mathrm{supp}(X)}$ and $y \in \mathrm{supp}(Y)$: Therefore, if $(E,W,X,Y)$ is observed in the source domains and $W$ in the target domain, the causal effect of $X$ on $Y$ in the target domain $e_T$ is identifiable.
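
To make the identifiability claim concrete, the following is a minimal numerical sketch for fully discrete variables. It assumes the causal graph $E \to U$, $U \to W$, $U \to X$, $U \to Y$, $X \to Y$ and reads the rank assumption as: $P(W|E,x)$ has rank equal to the number of states of $U$. Under that reading, a standard matrix argument gives $q(y|\mathrm{do}(x)) = P(y|E,x)\, P(W|E,x)^{\dagger}\, Q(W)$, where $\dagger$ is the Moore-Penrose pseudo-inverse and $Q(W)$ is the proxy distribution in the target domain. This expression, the function name `transfer_effect`, and all probability tables are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical toy model; every number below is illustrative.
# Assumed graph: E -> U, U -> W, U -> X, U -> Y, X -> Y (U is never observed).
P_U_given_E = np.array([[0.8, 0.3],       # rows: u, columns: source domain e
                        [0.2, 0.7]])
P_W_given_U = np.array([[0.7, 0.1],       # rows: w, columns: u
                        [0.2, 0.3],
                        [0.1, 0.6]])
P_x1_given_U = np.array([0.6, 0.2])       # P(X=1 | U=u)
P_y1_given_U_x1 = np.array([0.9, 0.4])    # P(Y=1 | U=u, X=1)
Q_U = np.array([0.5, 0.5])                # distribution of U in the target domain


def transfer_effect(P_y_given_E_x, P_W_given_E_x, Q_W):
    """Assumed identification formula: P(y|E,x) @ pinv(P(W|E,x)) @ Q(W)."""
    return float(P_y_given_E_x @ np.linalg.pinv(P_W_given_E_x) @ Q_W)


# Quantities that only involve observed variables (E, W, X, Y in the source
# domains, W in the target domain). They are computed from the model here;
# in practice they would be estimated from data.
joint = P_x1_given_U[:, None] * P_U_given_E                 # P(X=1, U=u | E=e)
P_U_given_E_x1 = joint / joint.sum(axis=0, keepdims=True)   # Bayes' rule
P_W_given_E_x1 = P_W_given_U @ P_U_given_E_x1               # shape (3, 2)
P_y1_given_E_x1 = P_y1_given_U_x1 @ P_U_given_E_x1          # shape (2,)
Q_W = P_W_given_U @ Q_U                                     # shape (3,)

estimate = transfer_effect(P_y1_given_E_x1, P_W_given_E_x1, Q_W)

# Ground truth, which needs the unobserved U: sum_u P(Y=1 | u, X=1) Q(u).
truth = float(P_y1_given_U_x1 @ Q_U)

print(f"estimate from observables: {estimate:.6f}")
print(f"ground truth:              {truth:.6f}")
```

In this toy model the two printed values agree up to floating-point error, because $P(W|U)$ has full column rank and $P(U|E,x)$ is invertible. If either condition fails, the pseudo-inverse no longer cancels the unobserved factors and the two values can differ.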

Figures (14)

  • Figure 1: Causal effect estimation in unseen domains using proxies. We seek to estimate the causal effect of a treatment $X$ on an outcome $Y$ in the presence of an unobserved confounder $U$. The main learning signal takes the form of proxy measurements $W$ of $U$. Moreover, we observe data from multiple domains or environments $E$, which differ through shifts in the distribution of $U$. (a) In the available source domains, we observe $E$, $X$, $Y$, and $W$. (b) In the target domain ($E=e_T$) for which we aim to estimate the causal effect, only $W$ is observed. We prove that the available data can suffice to identify the target interventional distribution $\mathbb{Q}_Y^{\mathrm{do}(X:=x)} := \mathbb{P}_{Y}^{\mathrm{do}(X:=x, E:=e_T)}$.
  • Figure 2: Examples of causal graphs satisfying our identifiability conditions. Whereas the paper mostly focuses on the scenario from \ref{fig:CausalGraph1}, our identifiability results hold for a broader class of causal models including additional covariates $Z$ and unmediated covariate shift via $E\to X$, see \ref{app:extensions}.
  • Figure 3: Absolute estimation error increases when $P(W|E,x)$ is nearly non-invertible. For $M{=}25$ distinct data generating processes, we draw $N{=}5$ datasets each, consisting of $k_E{=}2$ source domains and $n{=}20\,000$ realizations. The errors of both estimators increase with the condition number $\kappa$ of the matrix $P(W|E,x)$ (or its estimated counterpart), a measure of how close the matrix is to being non-invertible.
  • Figure 4: Comparison of the different estimation procedures. We use the parameters $k_E=3$, $n=20\,000$, $M=10$, and $N=5$. The oracle attains the lowest error because it uses data from the interventional distribution. The reduced and causal parametrisation estimators show similar error distributions, with medians close to zero. They perform better than the other baseline methods, whose absolute estimation error distributions are shifted towards larger values. The estimators that use the distribution of $(X,Y)$ in the target domain (marked with $^*$) achieve lower absolute estimation error than their pooled counterparts.
  • Figure 5: Reduced parametrisation estimator compared with four baselines and the ground truth for a real dataset. We compare estimates of the causal effect $q(Y=1|\mathrm{do}(X=1))$ from the reduced estimator and the no-adjustment baselines against the oracle confidence intervals, using 25 source and 18 target domains. For both, the confidence intervals overlap with those of the oracle in all target domains. The reduced estimator yields estimates closer to the oracle in more target domains and slightly narrower confidence intervals than NoAdj*.
  • ...and 9 more figures
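
The condition number $\kappa$ mentioned in the caption of Figure 3 can be computed from the singular values of the matrix. The sketch below is illustrative only: it assumes the 2-norm condition number (the caption does not say which norm is used), and the helper name `proxy_condition_number` and both example matrices are hypothetical.

```python
import numpy as np


def proxy_condition_number(P_W_given_E_x):
    """2-norm condition number: largest over smallest singular value."""
    s = np.linalg.svd(P_W_given_E_x, compute_uv=False)  # sorted descending
    return float(s[0] / s[-1])


# Hypothetical P(W|E,x) matrices (rows: w, columns: e; columns sum to 1).
informative = np.array([[0.9, 0.1],
                        [0.1, 0.9]])     # proxy distribution differs a lot across domains
nearly_flat = np.array([[0.51, 0.49],
                        [0.49, 0.51]])   # proxy distribution barely changes with the domain

kappa_informative = proxy_condition_number(informative)
kappa_nearly_flat = proxy_condition_number(nearly_flat)

print(f"kappa (informative): {kappa_informative:.2f}")
print(f"kappa (nearly flat): {kappa_nearly_flat:.2f}")
```

A large $\kappa$ means that small errors in the estimated matrix or in the target proxy distribution can be strongly amplified when the matrix is inverted, which is consistent with the error growth described for Figure 3.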

Theorems & Definitions (22)

  • Theorem 1
  • Remark 2
  • Proposition 2
  • Proposition 2
  • Proposition 2
  • Theorem 2
  • proof
  • Proposition 2
  • proof
  • Proposition 2
  • ...and 12 more