On Training Survival Models with Scoring Rules

Philipp Kopper, David Rügamer, Raphael Sonabend, Bernd Bischl, Andreas Bender

TL;DR

Empirical comparisons on synthetic and real-world data indicate that scoring rules can be successfully incorporated into model training and yield predictive performance competitive with established time-to-event models.

Abstract

Scoring rules are an established way of comparing predictive performance across model classes. In the context of survival analysis, they require adaptation in order to accommodate censoring. This work investigates using scoring rules for model training rather than evaluation. In doing so, we establish a general framework for training survival models that is model-agnostic and can learn event time distributions parametrically or non-parametrically. In addition, our framework is not restricted to any specific scoring rule. While we focus on neural network-based implementations, we also provide proof-of-concept implementations using gradient boosting, generalized additive models, and trees. Empirical comparisons on synthetic and real-world data indicate that scoring rules can be successfully incorporated into model training and yield predictive performance competitive with established time-to-event models.

Paper Structure (26 sections, 10 equations, 3 figures, 5 tables)

Figures (3)

  • Figure 1: Example architectures for our proposed method in the single-risk case. Top: Parametric approach. We pass the data through a fully connected neural network to estimate the parameters (here $\theta_1$ and $\theta_2$) of a survival distribution. We generate predictions for each $t = \tau_j$ using the parameterized $F$. Bottom: Non-parametric approach. We pass the data through a fully connected neural network to estimate the survival increments $\alpha_j$ and use them to generate survival predictions for each $\tau_j$, where $\xi(\cdot) := \gamma_2(-\sum(\cdot))$.
  • Figure 2: Selected predictions from the parametric and non-parametric frameworks for the METABRIC data set. All models are tuned. The variants relying on the PH assumption do not allow curves to cross. The step functions for the non-parametric framework vary in roughness depending on the parametrization.
  • Figure 3: Results of the comparison to ML estimation. Left: Difference between the estimated parameters $\hat{\theta}$ and the oracle parameters $\theta$. Parameter comparison for Cox PH models is limited to the coefficients $\beta_1, \beta_2$ and $\beta_3$, and the Weibull distribution. Right: Relative difference in predictive performance with respect to the data-generating process (DGP). Optimal performance is given by $\text{RISBS}_{\text{DGP}}$, obtained by using the true parameters in the correctly specified model.
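
The two output heads described in the Figure 1 caption can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. Several choices here are assumptions that the caption does not state: a Weibull distribution stands in for the parametric family, with $\theta_1$ as shape and $\theta_2$ as scale; softplus is used to keep parameters and increments positive; and $\gamma_2$ is taken to be the exponential function, so that $\xi$ maps cumulated increments to a survival probability. The function names and the random "network outputs" are hypothetical placeholders.

```python
import numpy as np


def softplus(x):
    # Numerically stable softplus: maps unconstrained values to positive ones.
    return np.logaddexp(0.0, x)


def parametric_survival(raw_theta, tau):
    """Parametric head (assumed Weibull): S(tau_j) = exp(-(tau_j / scale) ** shape).

    raw_theta: (n, 2) unconstrained outputs standing in for a network's last layer.
    tau:       (m,) evaluation grid tau_1 < ... < tau_m.
    """
    shape = softplus(raw_theta[:, 0])[:, None]  # theta_1 > 0 (assumed role: shape)
    scale = softplus(raw_theta[:, 1])[:, None]  # theta_2 > 0 (assumed role: scale)
    return np.exp(-((tau[None, :] / scale) ** shape))


def nonparametric_survival(raw_alpha):
    """Non-parametric head: S(tau_j) = gamma_2(-sum_{k <= j} alpha_k).

    raw_alpha: (n, m) unconstrained outputs, one per grid point.
    gamma_2 is assumed to be exp here; the caption leaves it unspecified.
    """
    alpha = softplus(raw_alpha)  # positive increments alpha_j (assumed constraint)
    return np.exp(-np.cumsum(alpha, axis=1))


# Hypothetical stand-ins for network outputs on 4 observations and 10 grid points.
rng = np.random.default_rng(0)
tau = np.linspace(0.5, 5.0, 10)
raw_theta = rng.normal(size=(4, 2))
raw_alpha = rng.normal(size=(4, tau.size))

S_par = parametric_survival(raw_theta, tau)   # smooth curves evaluated on the grid
S_non = nonparametric_survival(raw_alpha)     # step-function values on the grid
```

Under these assumptions, both heads return one survival probability per observation and grid point, and each row is non-increasing in $\tau_j$. A censoring-adjusted scoring rule could then be evaluated on these grid values and used as the training loss.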