Causal survival embeddings: non-parametric counterfactual inference under censoring

Carlos García-Meixide, Marcos Matabuena

TL;DR

This paper presents a non-parametric, model-free framework for counterfactual survival analysis under right-censoring by embedding counterfactual distributions in reproducing kernel Hilbert spaces. The approach leverages kernel mean embeddings and integrated depth bands to adjust for confounding without requiring density smoothness, and it provides Hadamard-differentiable operators with convergence guarantees. Through simulations and an application to the SPRINT trial, the method demonstrates stable performance under censoring and offers a flexible tool for time-varying causal inference and hypothesis testing in observational studies. The work sits at the intersection of causal inference, survival analysis, and RKHS theory, offering a practical, extensible alternative to semi-parametric methods with potential for incorporating complex predictors and advanced testing procedures.

Abstract

Model-free time-to-event regression under confounding presents challenges due to biases introduced by causal and censoring sampling mechanisms. This phenomenology poses problems for classical non-parametric estimators like Beran's or the k-nearest neighbours algorithm. In this study, we propose a natural framework that leverages the structure of reproducing kernel Hilbert spaces (RKHS) and, specifically, the concept of kernel mean embedding to address these limitations. Our framework has the potential to enable statistical counterfactual modeling, including counterfactual prediction and hypothesis testing, under right-censoring schemes. Through simulations and an application to the SPRINT trial, we demonstrate the practical effectiveness of our method, yielding coherent results when compared to parallel analyses in existing literature. We also provide a theoretical analysis of our estimator through an RKHS-valued empirical process. Our approach offers a novel tool for performing counterfactual survival estimation in observational studies with incomplete information. It can also be complemented by state-of-the-art algorithms based on semi-parametric and parametric models.
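
As a rough illustration of the kernel mean embedding idea mentioned above, the sketch below evaluates a weighted empirical embedding $\hat{\mu}(t)=\sum_i w_i\,k(T_i, t)$ of a survival-time distribution from right-censored data. It is not the paper's estimator, which this summary does not reproduce. The Kaplan-Meier jump weights, the Gaussian kernel, the bandwidth, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix k(a_i, b_j) for 1-D arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def kaplan_meier_weights(times, events):
    """Jumps of the Kaplan-Meier estimator at each observation.

    Assumes no tied times. Censored observations (events == 0) get weight 0.
    This weighting is an illustrative assumption, not taken from the paper.
    """
    n = len(times)
    order = np.argsort(times, kind="stable")
    d = events[order].astype(float)
    ranks = np.arange(1, n + 1)
    # Estimated survival just before the i-th ordered time.
    factors = ((n - ranks) / (n - ranks + 1.0)) ** d
    surv_before = np.concatenate(([1.0], np.cumprod(factors)[:-1]))
    w_sorted = d / (n - ranks + 1.0) * surv_before
    weights = np.empty(n)
    weights[order] = w_sorted  # map back to the original observation order
    return weights

def embedding_at(grid, times, weights, bandwidth=1.0):
    """Evaluate the weighted empirical mean embedding on a grid of time points."""
    return gaussian_kernel(grid, times, bandwidth) @ weights

# Hypothetical simulated data: exponential event times with independent censoring.
rng = np.random.default_rng(0)
event_time = rng.exponential(scale=2.0, size=200)
censor_time = rng.exponential(scale=4.0, size=200)
observed = np.minimum(event_time, censor_time)
delta = (event_time <= censor_time).astype(int)

w = kaplan_meier_weights(observed, delta)
mu_hat = embedding_at(np.linspace(0.0, 6.0, 25), observed, w, bandwidth=0.5)
```

When no observation is censored, the weights reduce to $1/n$ and the sketch coincides with the usual empirical kernel mean embedding.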

Paper Structure (24 sections, 11 theorems, 55 equations, 8 figures)

Key Result

Lemma 1

In general, ${S}_{T\langle 0 \mid 0\rangle}={S}_{\tilde{T}^0 \mid Z=0}$ and ${S}_{T\langle 1 \mid 1\rangle}={S}_{\tilde{T}^1 \mid Z=1}$. Moreover, if conditional exogeneity holds and $\operatorname{support}(S_{X^1})=\operatorname{support}(S_{X^0})$, then we also have ${S}_{T\langle 0 \mid 1\rangle}={S}_{\tilde{T}^0 \mi

Figures (8)

  • Figure 1: $n=100$
  • Figure 2: $n=200$
  • Figure 3: $n=300$
  • Figure 4: $n=500$
  • Figure 6: DAG depicting underlying causal structure of the medical problem; taken from stensrudathero. The primary aim of their investigation was to decompose the total effect of intensive therapy versus standard therapy into two separate pathways: (i) a direct pathway that encompasses all effects not involving a reduction in diastolic blood pressure below 60 mmHg, comprising the advantageous impact of reducing systolic blood pressure, and (ii) an indirect pathway that acts through on-treatment DBP below 60 mmHg and has the potential to be deleterious.
  • ...and 3 more figures

Theorems & Definitions (30)

  • Definition 1: Counterfactual distributions, cherno13
  • Lemma 1: cherno13cme
  • proof
  • Definition 2: Conditional mean embedding, song2009hilbert
  • Definition 3: Counterfactual mean embedding, cme
  • Definition 4
  • Definition 5
  • Theorem 1
  • Theorem 2
  • Definition 6: Vector-valued RKHS, carmeli2006vector
  • ...and 20 more