Fairness in Survival Analysis with Distributionally Robust Optimization
Shu Hu, George H. Chen
TL;DR
The paper develops a distributionally robust optimization (DRO) framework for encouraging fairness in survival analysis: it minimizes the worst-case risk over all subpopulations that occur with probability at least $\alpha$. It gives a general method for converting common survival models (Cox, DeepHit, SODEN) into DRO variants, using a sample splitting technique to handle loss functions that do not decompose across individual data points, and proves a finite-sample convergence guarantee for the resulting loss; an exact DRO formulation that avoids sample splitting is also derived for the Cox model. Experiments on the FLC, SUPPORT, and SEER datasets show that the DRO variants can improve fairness metrics such as CI and censoring-based fairness measures, with modest trade-offs in accuracy, compared to existing fairness regularizers. The work also extends to competing risks and discusses practical considerations for hyperparameter tuning, data splits, and evaluation metrics. Overall, it offers a flexible DRO approach that applies to many existing time-to-event models and does not require the user to specify sensitive attributes in the training loss, which matters when survival models inform high-stakes decisions.
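One standard way to write down a worst-case subpopulation risk is sketched below. This is an illustration under assumptions, not the paper's exact notation: the symbols $P$ (the data distribution), $\ell(\theta; Z)$ (a per-point loss), and $\eta$ (a scalar threshold) are introduced here, and the paper's own objective may rely on a different divergence-based surrogate.

```latex
\mathcal{R}_{\max}(\theta)
  = \sup_{Q \in \mathcal{P}_\alpha} \mathbb{E}_{Z \sim Q}\big[\ell(\theta; Z)\big],
\qquad
\mathcal{P}_\alpha
  = \big\{ Q : P = \alpha' Q + (1 - \alpha') Q'
      \text{ for some } \alpha' \in [\alpha, 1] \text{ and some distribution } Q' \big\}

% For this particular uncertainty set, the supremum equals the
% conditional value-at-risk of the loss at level \alpha:
\mathcal{R}_{\max}(\theta)
  = \inf_{\eta \in \mathbb{R}}
    \Big\{ \eta + \frac{1}{\alpha}\,
      \mathbb{E}_{Z \sim P}\big[ (\ell(\theta; Z) - \eta)_+ \big] \Big\}
```

Neither expression refers to group membership: the worst case is taken over every subpopulation of mass at least $\alpha$, which is why no sensitive attribute has to be named in the training loss.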
Abstract
We propose a general approach for encouraging fairness in survival analysis models based on minimizing a worst-case error across all subpopulations that occur with at least a user-specified probability. This approach can be used to convert many existing survival analysis models into ones that simultaneously encourage fairness, without requiring the user to specify which attributes or features to treat as sensitive in the training loss function. From a technical standpoint, our approach applies recent developments in distributionally robust optimization (DRO) to survival analysis. The complication is that existing DRO theory uses a training loss function that decomposes across contributions of individual data points, i.e., any term that shows up in the loss function depends only on a single training point. This decomposition does not hold for commonly used survival loss functions, including those of the Cox proportional hazards model, its deep neural network variants, and many other recently developed models that use loss functions involving ranking or similarity score calculations. We address this technical hurdle using a sample splitting strategy. We demonstrate our sample splitting DRO approach by using it to create fair versions of a diverse set of existing survival analysis models, including the Cox model (and its deep variant DeepSurv), the discrete-time model DeepHit, and the neural ODE model SODEN. We also establish a finite-sample theoretical guarantee that shows what our sample splitting DRO loss converges to. For the Cox model, we further derive an exact DRO approach that does not use sample splitting. For all the models that we convert into DRO variants, we show that the DRO variants often score better on recently established fairness metrics (without incurring a significant drop in accuracy) compared to existing survival analysis fairness regularization techniques.
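To make the decomposition issue and the sample splitting idea concrete, here is a minimal NumPy sketch. It is not the paper's implementation. It assumes a Cox-style loss in which each "anchor" point is compared only against a held-out "reference" split, so that each anchor's loss term depends on a single anchor point once the reference split is fixed. It also assumes a conditional value-at-risk (CVaR) form of the worst-case subpopulation risk. The function names `split_cox_losses` and `cvar_dro`, and the choice to include the anchor point in its own risk set, are hypothetical and chosen only for illustration.

```python
import numpy as np

def split_cox_losses(scores, times, events, anchor_idx, ref_idx):
    """Per-point Cox-style losses for anchor points (illustrative assumption).

    Each anchor's risk set is drawn only from the reference split, so the
    loss terms for different anchors do not share anchor points.
    """
    losses = []
    for i in anchor_idx:
        if events[i] == 0:
            # Censored points contribute no term to the partial likelihood.
            losses.append(0.0)
            continue
        # Reference points still at risk at the anchor's event time.
        at_risk = ref_idx[times[ref_idx] >= times[i]]
        # Include the anchor itself so the risk set is never empty.
        vals = np.concatenate(([scores[i]], scores[at_risk]))
        m = vals.max()
        log_sum_exp = m + np.log(np.exp(vals - m).sum())
        losses.append(log_sum_exp - scores[i])
    return np.array(losses)

def cvar_dro(losses, alpha):
    """Empirical CVaR: min over eta of eta + mean(max(loss - eta, 0)) / alpha.

    The objective is convex and piecewise linear in eta, so its minimum
    is attained at one of the observed loss values.
    """
    candidates = np.unique(losses)
    objectives = [
        eta + np.maximum(losses - eta, 0.0).mean() / alpha
        for eta in candidates
    ]
    return float(min(objectives))

# Toy data: 8 points, randomly split into anchor and reference halves.
rng = np.random.default_rng(0)
n = 8
scores = rng.normal(size=n)          # stand-in for model log-risk scores
times = rng.exponential(size=n)      # observed times
events = rng.integers(0, 2, size=n)  # 1 = event observed, 0 = censored
perm = rng.permutation(n)
anchor_idx, ref_idx = perm[: n // 2], perm[n // 2 :]

per_point = split_cox_losses(scores, times, events, anchor_idx, ref_idx)
dro_loss = cvar_dro(per_point, alpha=0.5)
```

With `alpha=1.0` the CVaR objective reduces to the ordinary average loss; smaller values of `alpha` put more weight on the worst-off points. In a training loop one would minimize `dro_loss` over the model parameters that produce `scores`, which requires an automatic differentiation framework rather than plain NumPy.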
