Evaluating and Learning Optimal Dynamic Treatment Regimes under Truncation by Death

Sihyung Park; Wenbin Lu; Shu Yang

TL;DR

This paper addresses the evaluation and learning of dynamic treatment regimes when truncation by death leaves outcomes ill-defined. It adopts a principal-stratification target, the always-survivor value function $V_{\text{AS}}(\pi)$, derives the efficient influence function and semiparametric efficiency bound for multi-stage settings, and proposes a multiply robust estimator that remains consistent under several nuisance-model misspecifications. The estimator is embedded in an off-policy evaluation and learning framework that uses cross-fitting. Simulations demonstrate robustness and efficiency, and an application to MIMIC-III sepsis data shows improved policy performance and identifies interpretable determinants of treatment decisions, such as age and weight. The work supports reliable, personalized decision-making in critical care settings where death truncates follow-up, and lays groundwork for extensions to more general multi-decision-point regimes and time-to-death analyses.

Abstract

Truncation by death, a prevalent challenge in critical care, renders traditional dynamic treatment regime (DTR) evaluation inapplicable due to ill-defined potential outcomes. We introduce a principal stratification-based method, focusing on the always-survivor value function. We derive a semiparametrically efficient, multiply robust estimator for multi-stage DTRs, demonstrating its robustness and efficiency. Empirical validation and an application to electronic health records showcase its utility for personalized treatment optimization.
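
To make the target concrete, the always-survivor value function can be sketched in a simplified single-stage setting with a binary treatment. The notation here is an assumption for illustration and is not taken from the paper: $X$ denotes baseline covariates, $S(a)$ the potential survival indicator under treatment $a$, and $Y(a)$ the potential outcome, which is defined only when $S(a) = 1$. The paper's multi-stage definition may differ in its details.

```latex
% Illustrative single-stage sketch; notation assumed, not the paper's.
% Always-survivors are units who would survive under either treatment:
%   S(0) = S(1) = 1.
% For this stratum both Y(0) and Y(1) are well defined, so the value of a
% regime \pi can be defined by conditioning on stratum membership.
\[
  V_{\text{AS}}(\pi)
  = \mathbb{E}\bigl[\, Y(\pi(X)) \,\bigm|\, S(0) = 1,\; S(1) = 1 \,\bigr],
  \qquad
  Y(\pi(X)) = \pi(X)\, Y(1) + \{1 - \pi(X)\}\, Y(0).
\]
```

Because membership in the always-survivor stratum depends on both potential survival indicators, it is never observed directly for any individual, which is why estimating such a target requires identifying assumptions and nuisance models.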
