On Fisher Consistency of Surrogate Losses for Optimal Dynamic Treatment Regimes with Multiple Categorical Treatments per Stage
Nilanjana Laha, Nilson Chapagain, Victoria Cicherski, Aaron Sonabend-W
TL;DR
The paper investigates Fisher consistency of surrogate losses for learning optimal dynamic treatment regimes (DTRs) with an arbitrary number of stages and treatment levels per stage. It first shows that many concave surrogates, including broad families of PERM (permutation equivariant, relative-margin-based) losses, are not Fisher consistent in the multi-stage setting, which motivates moving beyond concave surrogates. It then establishes necessary and sufficient conditions for Fisher consistency within the class of non-negative, stagewise separable surrogates, and constructs smooth, non-concave surrogates (kernel-based and product-based) that are Fisher consistent. Building on these surrogates, the authors introduce Simultaneous Direct Search with Surrogates (SDSS), a gradient-based method that learns the DTR across all stages simultaneously, and derive regret decay results under small-noise and smoothness assumptions. Simulations and a sepsis EHR study suggest that SDSS can outperform stagewise methods such as Q-learning and ACWL, particularly under high risk of model misspecification or with high-dimensional, noisy covariates. These findings advance the understanding of surrogate design for multi-stage DTRs and offer a scalable, model-free framework for optimizing treatments at all stages simultaneously.
Abstract
Patients with chronic diseases often receive treatments at multiple time points, or stages. Our goal is to learn the optimal dynamic treatment regime (DTR) from longitudinal patient data. When both the number of stages and the number of treatment levels per stage are arbitrary, estimating the optimal DTR reduces to a sequential, weighted, multiclass classification problem (Kosorok and Laber, 2019). In this paper, we aim to solve this classification problem simultaneously across all stages using Fisher consistent surrogate losses. Although computationally feasible Fisher consistent surrogates exist in special cases, e.g., the binary treatment setting, a unified theory of Fisher consistency remains largely unexplored. We establish necessary and sufficient conditions for DTR Fisher consistency within the class of non-negative, stagewise separable surrogate losses. To our knowledge, this is the first result in the DTR literature to provide necessary conditions for Fisher consistency within a non-trivial surrogate class. Furthermore, we show that many convex surrogate losses fail to be Fisher consistent for the DTR classification problem, and we formally establish this inconsistency for smooth, permutation equivariant, and relative-margin-based convex losses. Building on this, we propose SDSS (Simultaneous Direct Search with Surrogates), which uses smooth, non-concave surrogate losses to learn the optimal DTR. We develop a computationally efficient, gradient-based algorithm for SDSS. When the optimization error is small, we establish a sharp upper bound on SDSS's regret decay rate. We evaluate the numerical performance of SDSS through simulations and demonstrate its real-world applicability by estimating optimal fluid resuscitation strategies for severe septic patients using electronic health record data.
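To make the weighted classification view concrete, the sketch below computes the standard inverse-probability-weighted (IPW) estimate of a regime's value, in which a patient contributes only if the observed treatments agree with the regime at every stage, and then a smoothed version in which each agreement indicator is replaced by a differentiable score. The softmax relaxation, the function names, and the toy data are illustrative assumptions; this is not the paper's kernel-based or product-based surrogate, and no Fisher consistency claim is made for it.

```python
import numpy as np

def ipw_value(rewards, actions, propensities, regime_actions):
    """IPW estimate of a regime's value.

    rewards:        (n,)   total outcome per patient
    actions:        (n, T) observed treatment level at each stage
    propensities:   (n, T) probability of the observed treatment
    regime_actions: (n, T) treatment level the regime would assign
    """
    # Product over stages of the indicators 1[A_t = d_t(H_t)].
    agree = np.all(actions == regime_actions, axis=1)
    weights = rewards / np.prod(propensities, axis=1)
    return np.mean(weights * agree)

def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    expd = np.exp(shifted)
    return expd / expd.sum(axis=-1, keepdims=True)

def smoothed_value(rewards, actions, propensities, scores):
    """Smooth stand-in for ipw_value (illustrative assumption only).

    scores: (n, T, K) real-valued score for each of K treatment levels.
    Each indicator is replaced by the softmax mass on the observed action,
    so the objective is differentiable in the scores for all stages at once.
    """
    probs = softmax(scores)
    picked = np.take_along_axis(probs, actions[:, :, None], axis=2)[:, :, 0]
    weights = rewards / np.prod(propensities, axis=1)
    return np.mean(weights * np.prod(picked, axis=1))

# Hypothetical toy data: 4 patients, 2 stages, 3 treatment levels,
# treatments assigned uniformly at random (propensity 1/3 per stage).
rewards = np.array([1.0, 2.0, 3.0, 4.0])
actions = np.array([[0, 1], [2, 2], [1, 0], [0, 0]])
propensities = np.full((4, 2), 1.0 / 3.0)
regime_actions = np.array([[0, 1], [2, 0], [1, 0], [1, 0]])

hard = ipw_value(rewards, actions, propensities, regime_actions)

# Very peaked scores on the regime's actions approximate the indicators.
peaked_scores = 50.0 * np.eye(3)[regime_actions]
soft_peaked = smoothed_value(rewards, actions, propensities, peaked_scores)

# Flat scores spread the mass uniformly over the treatment levels.
soft_flat = smoothed_value(rewards, actions, propensities, np.zeros((4, 2, 3)))
```

Because the smoothed objective depends on the scores of every stage jointly, a gradient-based optimizer can update all stages at once rather than stage by stage, which is the general idea behind simultaneous optimization. Whether maximizing a given relaxation recovers the optimal regime is exactly the Fisher consistency question the paper studies.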
