Proximal Survival Analysis to Handle Dependent Right Censoring
Andrew Ying
TL;DR
This work introduces proximal survival analysis to address dependent right censoring by leveraging proxy covariates organized into three types. It builds event-inducing and censoring-inducing bridge processes to nonparametrically identify finite-dimensional survival parameters, yielding three estimators (PEE, PCE, and the doubly robust PDRE) that are consistent and asymptotically normal under mild completeness and positivity conditions. In simulations and a SEER-Medicare application, the approach is robust to imperfect covariate measurements and latent prognoses, offering a principled alternative when conditional independent censoring is questionable. The framework highlights the practical relevance of proximal causal inference in survival analysis and points to future work on nonparametric nuisance estimation and broader coarsening scenarios.
Abstract
Many epidemiological and clinical studies aim to analyze a time-to-event endpoint. A common complication is right censoring. In some cases, it arises because subjects are still alive when the study terminates or because they move out of the study area, in which case right censoring is typically treated as independent or non-informative. This assumption can be relaxed to conditional independent censoring by leveraging possibly time-varying covariate information, if available, assuming that censoring and failure time are independent within covariate strata. In yet other instances, events may be censored by competing events such as death and be associated with censoring, possibly through prognoses. Realistically, measured covariates can rarely capture all such associations with certainty. Under such dependent censoring, covariate measurements are often at best proxies of the underlying prognoses. In this paper, we establish a nonparametric identification framework by formally admitting that conditional independent censoring may fail in practice and by treating covariate measurements as imperfect proxies of the underlying association. The framework suggests adaptive estimators, and we give generic assumptions under which they are consistent, asymptotically normal, and doubly robust. We illustrate our framework in concrete settings, where we examine the finite-sample performance of our proposed estimators via a Monte Carlo simulation and apply them to the SEER-Medicare dataset.
