Nonparametric Heterogeneous Long-term Causal Effect Estimation via Data Combination

Weilin Chen, Ruichu Cai, Junjie Wan, Zeqin Yang, José Miguel Hernández-Lobato

TL;DR

This paper tackles the challenging problem of estimating heterogeneous long-term causal effects from fused short-term experimental and long-term observational data under the Conditional Additive Equi-Confounding Bias assumption. It introduces two-stage regression- and propensity-based HLCE estimators, and a novel multiple robust estimator that remains consistent when any one of several nuisance-function sets is correctly specified, with theoretical convergence guarantees and oracle-rate comparisons. A neural-network-based MR estimator with shared representations is developed to enhance practical performance, and extensive experiments on synthetic, semi-synthetic IHDP, and News datasets demonstrate improved HLCE accuracy and stability, especially in small-sample regimes. The work offers principled tools for personalized long-term decision-making and advances understanding of robustness and efficiency in HLCE estimation.

Abstract

Long-term causal inference has drawn increasing attention in many scientific domains. Existing methods mainly focus on estimating average long-term causal effects by combining long-term observational data and short-term experimental data. However, it is still understudied how to robustly and effectively estimate heterogeneous long-term causal effects, significantly limiting practical applications. In this paper, we propose several two-stage style nonparametric estimators for heterogeneous long-term causal effect estimation, including propensity-based, regression-based, and multiple robust estimators. We conduct a comprehensive theoretical analysis of their asymptotic properties under mild assumptions, with the ultimate goal of building a better understanding of the conditions under which some estimators can be expected to perform better. Extensive experiments across several semi-synthetic and real-world datasets validate the theoretical results and demonstrate the effectiveness of the proposed estimators.
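As a rough illustration of what a regression-based data-combination estimate could look like, the sketch below combines three conditional-mean contrasts built from the nuisance functions named in the Figure 4 caption ($\hat{\mu}^O_S$, $\hat{\mu}^E_S$, $\hat{\mu}^O_Y$). The exact identification formula is not reproduced in this text, so the combination used here (observational long-term contrast, plus experimental short-term contrast, minus observational short-term contrast) is an assumption of this sketch rather than the paper's stated result. The function names, the discrete covariate, and the group-mean "regressions" are likewise hypothetical simplifications, and the second-stage regression of the paper's two-stage estimators is omitted.

```python
from collections import defaultdict


def group_means(rows):
    """Estimate a conditional mean by averaging within each (a, x) cell.

    rows: iterable of (x, a, value) with a discrete covariate x.
    Returns a dict mapping (a, x) to the cell mean of value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, a, value in rows:
        sums[(a, x)] += value
        counts[(a, x)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def plug_in_long_term_effect(exp_rows, obs_rows, x):
    """Plug-in estimate of the long-term effect at covariate value x.

    exp_rows: (x, a, s) tuples from the short-term experimental data.
    obs_rows: (x, a, s, y) tuples from the long-term observational data.
    ASSUMPTION (not taken from the source text): the effect is the
    observational Y contrast, corrected by the gap between the
    experimental and observational S contrasts.
    """
    mu_s_exp = group_means(exp_rows)
    mu_s_obs = group_means((xi, a, s) for xi, a, s, _ in obs_rows)
    mu_y_obs = group_means((xi, a, y) for xi, a, _, y in obs_rows)

    def contrast(mu):
        # Treated-minus-control difference of cell means at x.
        return mu[(1, x)] - mu[(0, x)]

    return contrast(mu_y_obs) + contrast(mu_s_exp) - contrast(mu_s_obs)


# Tiny hand-made example with a single covariate value x = 0.
exp_rows = [(0, 1, 2.0), (0, 0, 1.0)]
obs_rows = [(0, 1, 3.0, 5.0), (0, 0, 1.0, 2.0)]
print(plug_in_long_term_effect(exp_rows, obs_rows, 0))
```

In this toy example the observational short-term contrast (2.0) overshoots the experimental one (1.0) by 1.0, and the sketch subtracts that same amount from the observational long-term contrast (3.0). With continuous covariates, the group means would be replaced by any flexible regression model.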

Paper Structure

This paper contains 32 sections, 7 theorems, 69 equations, 6 figures, 2 tables.

Key Result

Theorem 1

Suppose the assumptions of consistency, positivity, internal validity of the observational data, internal validity of the experimental data, external validity of the experimental data, and conditional additive equi-confounding bias hold. Then $\tau(x)$ can be identified as follows:

Figures (6)

  • Figure 1: Causal graphs of the experimental and observational data, where $X$ denotes the covariates, $U$ the unobserved confounders, $A$ the treatment, $S$ the short-term outcome, and $Y$ the long-term outcome. Gray nodes denote unobserved variables and white nodes denote observed variables. Arrows denote causal relationships. In the causal graph of the short-term experimental data, the treatment $A$ is not affected by the unobserved confounders $U$, and the long-term outcome $Y$ is unobserved. In the causal graph of the long-term observational data, the unobserved confounders $U$ affect the treatment $A$ and the outcomes $S$ and $Y$, and the long-term outcome $Y$ is observed.
  • Figure 2: Our neural network-based model architecture of the MR estimator $\hat{\tau}_{mr}(x)$. Pink blocks denote MLPs. White blocks denote inputs or outputs. Green blocks denote short-term nuisance functions. Blue blocks denote long-term nuisance functions. White circles denote switches. Both learning stages are implemented using neural networks. The top figure shows our first-stage learning, where we learn the shared representations across experimental and observational data, treated and control groups, as well as short- and long-term outcome predictions. The bottom left figure illustrates our second-stage learning, where we regress the pseudo outcome $\hat{Y}_{mr}$ on the covariates $X$.
  • Figure 3: Results on dataset 2. The four panels report, in order: the PEHE of the heterogeneous effect estimation with a fixed size of experimental data and a varying size of observational data; the absolute error of the average effect estimation under the same setting; the PEHE of the heterogeneous effect estimation with a fixed size of observational data and a varying size of experimental data; and the absolute error of the average effect estimation under that second setting.
  • Figure 4: Results on dataset 1: model misspecification experiments. A bold index means the corresponding set of nuisance functions in Lemma 2 (MR Estimator Consistency) is correctly specified, and an index with $\prime$ means the corresponding set is misspecified. For example, $\mathcal{M}_{\textbf{1},\textbf{2},\textbf{3},\textbf{4}}$ means all sets of nuisance functions are correctly specified, and $\mathcal{M}_{\textbf{1},{2^\prime},{3^\prime},{4^\prime}}$ means the first set of nuisance functions, i.e. $\{ \hat{\mu}^O_S(a,x), \hat{\mu}^E_S(a,x),\hat{\mu}^O_Y(a,x) \}$, is correctly specified and the remaining sets are misspecified.
  • Figure 5: Our neural network-based model architecture of $\hat{\tau}_{reg}(x)$. Gray blocks denote MLPs.
  • ...and 1 more figures
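
The two error metrics named in the Figure 3 caption can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function names are hypothetical, and PEHE is computed in its root-mean-squared form, which is one common convention (the text here does not say whether the paper reports the squared or the rooted version).

```python
import math


def pehe(tau_true, tau_hat):
    """Root-mean-squared error between true and estimated unit-level effects.

    ASSUMPTION: root form of PEHE; some papers report the squared form.
    """
    if len(tau_true) != len(tau_hat) or not tau_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(h - t) ** 2 for t, h in zip(tau_true, tau_hat)]
    return math.sqrt(sum(squared) / len(squared))


def ate_abs_error(tau_true, tau_hat):
    """Absolute error of the average effect implied by the estimates."""
    if len(tau_true) != len(tau_hat) or not tau_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(tau_true)
    return abs(sum(tau_hat) / n - sum(tau_true) / n)


# Heterogeneous errors can cancel in the average but not in PEHE.
tau_true = [0.0, 0.0]
tau_hat = [3.0, -3.0]
print(pehe(tau_true, tau_hat))           # 3.0
print(ate_abs_error(tau_true, tau_hat))  # 0.0
```

The toy example shows why both metrics are reported: an estimator can recover the average effect exactly while still being far off for every individual covariate value.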

Theorems & Definitions (16)

  • Theorem 1
  • Lemma 1: Baselines Consistency
  • Lemma 2: MR Estimator Consistency
  • Definition 1: Oracle rate
  • Definition 2: Hölder ball
  • Theorem 2: Convergence Rate
  • Theorem 3: MR Estimator Convergence Rate
  • Corollary 1: Baseline Estimators Convergence Rate
  • Theorem 4: Estimators Convergence Rate
  • proof
  • ...and 6 more