Interpretable Prediction and Feature Selection for Survival Analysis
Mike Van Ness, Madeleine Udell
TL;DR
DyS addresses the need for interpretable survival analysis in large, high-dimensional datasets. It combines a generalized additive model with interactions and neural shape functions in a neural additive model framework, trained with a discrete-time Ranked Probability Score loss to optimize survival predictions directly. It integrates feature selection via smooth-step gates and supports a two-stage fitting procedure to scale to large datasets while preserving interpretability through time-specific feature importances and impact plots. Across synthetic data, benchmark datasets, and a large heart-failure cohort, DyS achieves competitive discrimination with intrinsic interpretability, enabling both prediction and feature selection in a glass-box survival model.
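To make the discrete-time Ranked Probability Score concrete, the following is a minimal sketch for a single subject. The summary above does not give the formula, so this follows the common survival-RPS convention as an assumption: squared error between the predicted cumulative distribution and the event indicator at each time bin, with bins at or after the censoring time skipped for censored subjects. The function name `discrete_rps` and its arguments are hypothetical, and the loss actually used by DyS may differ in normalization or weighting.

```python
def discrete_rps(pmf, event_bin, observed):
    """Ranked Probability Score for one subject on a discrete time grid.

    pmf       : predicted probability that the event falls in each bin (sums to 1)
    event_bin : index of the bin containing the event or censoring time
    observed  : True if the event was observed, False if the subject was censored

    Assumed convention (not taken from the source): censored subjects
    contribute only for bins strictly before the censoring bin.
    """
    cdf = 0.0
    score = 0.0
    for k, p in enumerate(pmf):
        cdf += p  # predicted P(T <= end of bin k)
        if k < event_bin:
            # Subject is known to be event-free here, so the target CDF is 0.
            score += cdf ** 2
        elif observed:
            # Event has already happened, so the target CDF is 1.
            score += (1.0 - cdf) ** 2
        # Censored and k >= event_bin: outcome unknown, no contribution.
    return score


pmf = [0.1, 0.2, 0.3, 0.4]
rps_event = discrete_rps(pmf, event_bin=2, observed=True)
rps_censored = discrete_rps(pmf, event_bin=2, observed=False)
rps_perfect = discrete_rps([0.0, 0.0, 1.0, 0.0], event_bin=2, observed=True)
```

Because the score is a smooth function of the predicted probabilities, it can be minimized directly by gradient descent, which is what optimizing survival predictions directly amounts to in this sketch.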
Abstract
Survival analysis is widely used to model time-to-event data when some observations are censored, particularly in healthcare for predicting future patient risk. In such settings, survival models must be both accurate and interpretable so that users (such as doctors) can trust the model and understand its predictions. While most of the literature focuses on discrimination, interpretability is equally important. A successful interpretable model should be able to describe how changing each feature impacts the outcome, and should use only a small number of features. In this paper, we present DyS (pronounced ``dice''), a new survival analysis model that achieves both strong discrimination and interpretability. DyS is a feature-sparse Generalized Additive Model, combining feature selection and interpretable prediction into one model. While DyS works well for all survival analysis problems, it is particularly useful for large (in $n$ and $p$) survival datasets such as those commonly found in observational healthcare studies. Empirical studies show that DyS competes with other state-of-the-art machine learning models for survival analysis, while being highly interpretable.
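As an illustration of what a feature-sparse additive model can look like, here is a minimal sketch in which each feature's shape function is multiplied by a gate in [0, 1]. The gate uses the standard cubic smooth-step function, which is exactly 0 or 1 outside a band of width `gamma`, so a feature whose gate parameter is pushed low enough is removed from the model entirely. Everything here is an assumption for illustration: the names `smooth_step` and `gated_additive_score` are hypothetical, plain Python functions stand in for the learned shape functions, interaction terms are omitted, and a single scalar output is produced rather than one output per time bin.

```python
def smooth_step(z, gamma=1.0):
    """Cubic smooth-step: 0 for z <= -gamma/2, 1 for z >= gamma/2, smooth between."""
    if z <= -gamma / 2:
        return 0.0
    if z >= gamma / 2:
        return 1.0
    return -2.0 / gamma**3 * z**3 + 3.0 / (2.0 * gamma) * z + 0.5


def gated_additive_score(x, shape_fns, gate_params, gamma=1.0):
    """Sum of per-feature shape functions, each scaled by its smooth-step gate.

    x           : feature values for one subject
    shape_fns   : one callable per feature (stand-ins for learned shape functions)
    gate_params : one gate parameter per feature (learned in a real model)
    """
    total = 0.0
    for xj, fj, zj in zip(x, shape_fns, gate_params):
        total += smooth_step(zj, gamma) * fj(xj)
    return total


# Hypothetical shape functions and gate parameters for three features.
shape_fns = [lambda v: v, lambda v: v * v, lambda v: -v]
gate_params = [1.0, -1.0, 0.0]  # fully on, fully off, half open
score = gated_additive_score([2.0, 3.0, 5.0], shape_fns, gate_params)
```

In this sketch the second feature contributes nothing because its gate is exactly zero, while each remaining term can be plotted on its own against its feature value, which is the sense in which an additive model describes how changing each feature impacts the outcome.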
