DeepBlip: Estimating Conditional Average Treatment Effects Over Time

Haorui Ma, Dennis Frauen, Stefan Feuerriegel

TL;DR

DeepBlip is the first neural framework for Structural Nested Mean Models (SNMMs). It estimates Conditional Average Treatment Effects over time (CATE_t) by learning time-local blip functions with a neural backbone. A novel double optimization trick removes the sequential g-estimation bottleneck, so all blip components can be trained end-to-end with gradient-based optimization while the loss stays Neyman-orthogonal and therefore robust to nuisance function misspecification. The architecture has two stages: nuisance networks that produce residuals, and a blip predictor that outputs blip coefficients. It uses sequential encoders (transformers or LSTMs) to capture temporal dependencies and supports efficient offline evaluation of optimal treatment sequences. On tumor growth and MIMIC-III-like datasets, DeepBlip achieves state-of-the-art CATE accuracy and scales favorably to long horizons, which makes it a candidate for personalized medicine settings with time-varying confounding and long-term treatment planning.

Abstract

Structural nested mean models (SNMMs) are a principled approach to estimating treatment effects over time. A particular strength of SNMMs is that they break the joint effect of a treatment sequence into localized, time-specific "blip effects". This decomposition promotes interpretability through the incremental effects and enables efficient offline evaluation of optimal treatment policies without re-computation. However, neural frameworks for SNMMs are lacking, because their inherently sequential g-estimation scheme prevents end-to-end, gradient-based training. Here, we propose DeepBlip, the first neural framework for SNMMs, which overcomes this limitation with a novel double optimization trick that enables simultaneous learning of all blip functions. DeepBlip integrates sequential neural networks such as LSTMs or transformers to capture complex temporal dependencies. By design, our method correctly adjusts for time-varying confounding to produce unbiased estimates, and its Neyman-orthogonal loss function ensures robustness to nuisance model misspecification. Finally, we evaluate DeepBlip across various clinical datasets, where it achieves state-of-the-art performance.
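To make the blip decomposition and the offline evaluation claim more concrete, here is a minimal toy sketch. It is not the DeepBlip architecture: it assumes additive linear blips of the form `theta[k] * a[k]` that ignore the patient history, and the names `sequence_effect`, `best_binary_sequence`, and `theta_hat` are hypothetical. In the actual method the blip coefficients depend on the observed history and are predicted by a neural network.

```python
import itertools
import numpy as np

def sequence_effect(theta, a, b):
    """Effect of treatment sequence a versus b on the final outcome.

    Toy assumption: the blip at step k is theta[k] * a[k], with no
    dependence on history, so the joint effect is a sum of blips.
    """
    theta, a, b = (np.asarray(x, dtype=float) for x in (theta, a, b))
    return float(np.sum(theta * (a - b)))

def best_binary_sequence(theta):
    """Pick the binary sequence with the largest effect versus no treatment.

    The blip coefficients are reused as-is; nothing is re-estimated,
    which is the sense in which evaluation happens offline.
    """
    baseline = np.zeros(len(theta))
    candidates = itertools.product([0, 1], repeat=len(theta))
    return max(candidates, key=lambda a: sequence_effect(theta, a, baseline))

# Hypothetical blip coefficients for a three-step horizon.
theta_hat = np.array([0.5, -0.2, 1.0])

effect = sequence_effect(theta_hat, [1, 1, 0], [0, 0, 0])
best = best_binary_sequence(theta_hat)
print(effect, best)
```

Under this toy assumption, comparing any two candidate sequences only requires summing already-estimated blips, so many policies can be scored without refitting a model.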

Paper Structure

This paper contains 51 sections, 3 theorems, 87 equations, 11 figures, 7 tables, 2 algorithms.

Key Result

Theorem 1

Adjustment via blip functions (Theorem 3.1 of Robins, 2004). Given a policy $d$ between time $t$ and $t+\tau$, the following identification holds under the sequential ignorability assumption: with

Figures (11)

  • Figure 1: CATE over time. Trajectories of potential outcomes under two interventional sequences $a^*_{t:t+\tau},b^*_{t:t+\tau}$ given the shared observed history $H_t$. The difference between the two curves is the CATE over time.
  • Figure 2: Neural architecture of the two-stage DeepBlip framework.
  • Figure 2: MIMIC-III with longer time horizons $\tau$. Normalized RMSE (mean $\pm$ std. dev. over 5 runs) for $\tau$-step-ahead CATE estimation on the MIMIC-III dataset. We highlight the relative improvement over the best-performing baseline.
  • Figure 3: Results for tumor growth dataset. Normalized RMSE (averaged over 5 runs) of CATE predictions against ground truth under increasing confounding. Here, $\tau=2$.
  • Figure 4: Scalability of Dynamic DML vs. DeepBlip. Shown is the training time.
  • ...and 6 more figures
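
The quantity described in the Figure 1 caption can be written compactly as follows. The notation is an assumption chosen to match the caption, not copied from the paper: $Y_{t+\tau}(\cdot)$ denotes the potential outcome at time $t+\tau$ under an interventional sequence, and $h_t$ is a realized value of the history $H_t$.

$$
\mathrm{CATE}_t\left(h_t;\, a^*_{t:t+\tau},\, b^*_{t:t+\tau}\right)
= \mathbb{E}\left[\, Y_{t+\tau}\left(a^*_{t:t+\tau}\right) - Y_{t+\tau}\left(b^*_{t:t+\tau}\right) \,\middle|\, H_t = h_t \right]
$$

Read this way, the gap between the two curves in Figure 1 at the final time step is the expected difference in outcomes between the two sequences for patients who share the same observed history.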

Theorems & Definitions (8)

  • Theorem 1
  • Remark 1
  • Remark 2
  • Theorem 2
  • proof
  • Lemma 1
  • proof
  • proof