Doubly Robust and Efficient Calibration of Prediction Sets for Right-Censored Time-to-Event Outcomes

Rebecca Farina; Eric J. Tchetgen Tchetgen; Arun Kumar Kuchibhotla

TL;DR

This work develops a flexible, calibration-based framework for predicting right-censored survival times without relying on correctly specified survival models. It introduces inverse-probability-of-censoring weighting (IPCW) and augmented IPCW (AIPCW) methods to construct lower prediction bounds with asymptotic marginal (and, under a special score, conditional) PAC guarantees, and complements them with Calibrated Outcome Regression (COR) and Outcome Regression (OR) approaches. AIPCW offers a doubly robust guarantee and improves efficiency by combining information from both the censoring and failure-time models via an efficient influence function. In extensive simulations and on a real RA cohort, the methods yield near-nominal coverage across diverse censoring regimes and model misspecifications, while providing practical, scalable procedures for uncertainty quantification in survival prediction.
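
As a reading aid, the standard identity behind IPCW calibration can be written out as follows. The notation is assumed here for illustration and is not taken from the paper: $\tilde T = \min(T, C)$ is the observed time, $\Delta = \mathbb{1}\{T \le C\}$ the event indicator, $G(t \mid x) = \mathbb{P}(C \ge t \mid X = x)$ the conditional censoring survival function (assumed positive where it is evaluated), and $\hat L$ a candidate lower prediction bound treated as fixed. Under conditional independence of $T$ and $C$ given $X$, the miscoverage probability of $\hat L$ is identified from the observed data:

```latex
\begin{align*}
\mathbb{E}\left[\frac{\Delta\,\mathbb{1}\{\tilde T < \hat L(X)\}}{G(\tilde T \mid X)}\right]
&= \mathbb{E}\left[\frac{\mathbb{1}\{T < \hat L(X)\}}{G(T \mid X)}\,
   \mathbb{P}(C \ge T \mid T, X)\right] \\
&= \mathbb{P}\bigl(T < \hat L(X)\bigr).
\end{align*}
```

The first equality uses $\tilde T = T$ on the event $\Delta = 1$; the second uses $\mathbb{P}(C \ge T \mid T, X) = G(T \mid X)$, which is where the conditional independence condition enters. In practice $G$ must be estimated, and the error of that estimate carries over to the coverage guarantee.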

Abstract

Our objective is to construct well-calibrated prediction sets, with guaranteed coverage, for a time-to-event outcome subject to right-censoring. Inspired by modern conformal inference, our approach avoids the need for a well-specified parametric or semiparametric survival model. Unlike existing conformal methods for survival data, which assume Type-I censoring with fully observed censoring times, we consider the more common right-censoring setting in which only the earlier of the censoring time and the event time is observed. Under a standard conditional independence censoring condition, we propose and analyze several lower prediction bounds for the survival time of a future observation, including inverse-probability-of-censoring weighting and its augmented version, which is based on the semiparametric efficient influence function for the relevant marginal quantile of the outcome accounting for dependent censoring. We formally establish asymptotic coverage guarantees for the proposed methods and demonstrate, both theoretically and through empirical experiments, that the augmented approach substantially improves efficiency over all other proposed methods. Specifically, its coverage error bound is doubly robust, and therefore of second order, which ensures that it is asymptotically negligible relative to the coverage error of the other methods.
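
To make the idea of IPCW calibration concrete, here is a minimal Python sketch. It is not the authors' algorithm: the function names, the shift-based family of bounds, the weight floor, and the simulated data are all assumptions chosen for illustration. It takes a crude baseline lower bound, estimates its miscoverage on censored calibration data with IPCW weights, and picks the smallest downward shift whose estimated miscoverage is at most `alpha`.

```python
import numpy as np

def ipcw_miscoverage(obs_time, event, cens_surv, lpb):
    """IPCW estimate of P(T < lpb(X)) from right-censored data.

    cens_surv[i] is an estimate of P(C >= obs_time[i] | X_i); the floor
    of 1e-3 on the weights' denominator is an arbitrary illustrative choice.
    """
    weights = event / np.clip(cens_surv, 1e-3, None)
    return float(np.mean(weights * (obs_time < lpb)))

def calibrate_shift(obs_time, event, cens_surv, base_lpb, alpha, grid):
    """Smallest shift theta such that max(base_lpb - theta, 0) has
    IPCW-estimated miscoverage <= alpha (hypothetical shift family)."""
    for theta in np.sort(grid):
        lpb = np.maximum(base_lpb - theta, 0.0)
        if ipcw_miscoverage(obs_time, event, cens_surv, lpb) <= alpha:
            return float(theta)
    return float(np.max(grid))  # fall back to the most conservative shift

# Simulated example (all distributions are assumptions for the demo).
rng = np.random.default_rng(0)
n, alpha = 20_000, 0.1

def simulate(size):
    x = rng.uniform(0.0, 1.0, size)
    t = rng.exponential(scale=1.0 + x)      # event time depends on x
    c = rng.exponential(scale=4.0, size=size)  # censoring independent of (x, t)
    return x, t, c

x_cal, t_cal, c_cal = simulate(n)
obs_time = np.minimum(t_cal, c_cal)
event = (t_cal <= c_cal).astype(float)

# The true censoring survival function is used to keep the sketch short;
# in practice it would be replaced by an estimate from a censoring model.
cens_surv = np.exp(-obs_time / 4.0)

base_lpb = 0.5 * (1.0 + x_cal)  # deliberately too aggressive baseline bound
theta_hat = calibrate_shift(obs_time, event, cens_surv, base_lpb,
                            alpha, np.linspace(0.0, 1.0, 201))

# Evaluate on fresh data with uncensored event times (possible only in simulation).
x_new, t_new, _ = simulate(n)
lpb_new = np.maximum(0.5 * (1.0 + x_new) - theta_hat, 0.0)
coverage = float(np.mean(t_new >= lpb_new))
print(f"shift = {theta_hat:.3f}, empirical coverage = {coverage:.3f}")
```

Because the estimated miscoverage is non-increasing in the shift, scanning the grid from the smallest value returns the least conservative bound that meets the target. The sketch covers only plain IPCW weighting; an augmented (AIPCW) version would additionally use a failure-time model and is not shown here.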

Paper Structure (27 sections, 7 theorems, 86 equations, 9 figures, 2 tables, 4 algorithms)

Introduction
Background
Problem setup
Related work
Our contributions
IPCW and AIPCW Methods
IPCW Method
AIPCW method
Calibrated Outcome Regression Method
Simulation studies
Simulation design and benchmark comparison methods
Synthetic data
Results
Real Data Application
Data description
...and 12 more sections

Key Result

Theorem 2.3

Let $\epsilon\in(0,1)$ be fixed. There exists a universal constant $K$ such that under Assumptions Assum:CIC and Assum:bound, with probability at least $1-\epsilon$ over $\mathcal{D}$ where the probability $\mathbb{P}$ is taken with respect to a new data point $(X,T)\sim P_{(X,T)}$. Furthermore, if the non-conformity score is chosen to satisfy eq:conditional-quantile-non-conformity-score, and if t

Figures (9)

Figure 1: Distribution of empirical marginal coverage rates across 100 simulated test datasets under Settings 1, 2, and 3 for each evaluated method. The red vertical line denotes the 90% target coverage.
Figure 2: Distribution of the average estimated LPB in the test set across 100 simulated datasets for each method under Settings 1, 2, and 3. The red vertical line at 0 marks the lower limit of the support of the time-to-event outcome.
Figure 3: Evaluation on the real-world dataset across 100 random data splits. The left and center panels display the distribution of empirical coverage using (A)IPCW-based coverage evaluation (left) and OR-based coverage evaluation (center), with red lines marking the 90% target coverage. The right panel displays the average LPB.
Figure F.1: Distribution of empirical marginal coverage rates across 100 simulated test datasets under Setting 1 for each evaluated method. The red vertical line denotes the 90% target coverage.
Figure F.2: Distribution of empirical marginal coverage rates across 100 simulated test datasets under Setting 2 for each evaluated method. The red vertical line denotes the 90% target coverage.
...and 4 more figures

Theorems & Definitions (23)

Definition 1.1
Definition 1.2
Theorem 2.3
Remark 2.4
Lemma 2.5
Theorem 2.6
Remark 2.7
Remark 2.8
Remark 2.9
Theorem 2.10
...and 13 more
