Teaching Models To Survive: Proper Scoring Rule and Stochastic Optimization with Competing Risks

Julie Alberge, Vincent Maladière, Olivier Grisel, Judith Abécassis, Gaël Varoquaux

TL;DR

Compared to 11 state-of-the-art models, this model, MultiIncidence, performs best in estimating the probability of outcomes in survival and competing risks, and is much faster than existing alternatives.

Abstract

When data are right-censored, i.e. some outcomes are missing due to a limited period of observation, survival analysis can compute the "time to event". Multiple classes of outcomes lead to a classification variant: predicting the most likely event, known as competing risks, which has been less studied. To build a loss that estimates outcome probabilities for such settings, we introduce a strictly proper censoring-adjusted separable scoring rule that can be optimized on a subpart of the data because the evaluation is made independently of observations. It enables stochastic optimization for competing risks which we use to train gradient boosting trees. Compared to 11 state-of-the-art models, this model, MultiIncidence, performs best in estimating the probability of outcomes in survival and competing risks. It can predict at any time horizon and is much faster than existing alternatives.
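To make the setting concrete, here is a minimal sketch of what right-censored competing-risks data look like. It is not taken from the paper: the function name `simulate_competing_risks`, the exponential latent times, the uniform censoring, and the convention that `event == 0` encodes a censored observation are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_competing_risks(n_samples=1000, n_events=2, horizon=5.0, seed=0):
    """Draw toy right-censored competing-risks data (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # One latent time per competing event; only the earliest one can occur.
    latent_times = rng.exponential(scale=4.0, size=(n_samples, n_events))
    event_time = latent_times.min(axis=1)
    event_type = latent_times.argmin(axis=1) + 1  # events are labeled 1..K

    # Observation stops at a random censoring time drawn uniformly in [0, horizon].
    censoring_time = rng.uniform(0.0, horizon, size=n_samples)
    censored = censoring_time < event_time

    # We only ever see the earlier of the two times, plus which one it was.
    duration = np.where(censored, censoring_time, event_time)
    event = np.where(censored, 0, event_type)  # 0 encodes "censored" (assumed convention)
    return duration, event


duration, event = simulate_competing_risks()
print("censoring rate:", np.mean(event == 0))
print("event counts:", np.bincount(event))
```

For censored rows the true event type and time are unknown, which is why a loss for this setting needs a censoring adjustment rather than a plain classification loss.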

Paper Structure (72 sections, 8 theorems, 46 equations, 15 figures, 8 tables, 1 algorithm)

Key Result

lemma 5

Accounting for the time horizon $\zeta$, the expectation of the above scoring rule can be written as: $\quad\forall \zeta, (\mathbf{X}, T, \Delta) \sim \mathcal{D},$

Figures (15)

  • Figure 1: MultiIncidence Model with its Feedback Loop. After giving the input to the model, a random time is given and the weights and the target can be computed. After one iteration, the feedback loop trains the censoring probability -- $G^\star$ in Eq. \ref{eqn:full_loss}.
  • Figure 2: Trade-off prediction/training time for competing risk on the synthetic dataset. Average IBS compared to the fitting time for each model on 20k training data points, with a censoring rate around 50% and dependent censoring for 6 features.
  • Figure 3: Trade-off prediction/training time for competing risk on the SEER dataset. Average IBS compared to the fitting time for each model, trained on the maximum number of training points (330k), except for Fine & Gray (50k) and RSF (100k). Table \ref{tab:ibs_event_seer} gives IBS values for each event.
  • Figure 4: Prediction accuracy at time $\zeta$. Accuracy of the argmax of the cumulative incidence functions at different time quantiles on the SEER dataset (higher is better).
  • Figure 5: Trade-off prediction/training time in survival usage. Performance (measured by IBS, the integrated Brier score) as a function of fitting time for each model.
  • ...and 10 more figures
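
The metric named in Figure 4 (accuracy of the argmax of the cumulative incidence functions at a time $\zeta$) can be sketched as follows. This is a rough illustration, not the paper's definition: the function name `accuracy_at_time`, the input layout, the label `0` for "no event yet", and the choice to drop samples censored before $\zeta$ are all assumptions.

```python
import numpy as np


def accuracy_at_time(cif_at_zeta, duration, event, zeta):
    """Accuracy of the most likely state at time zeta (hypothetical helper).

    cif_at_zeta: array of shape (n_samples, K), predicted P(T <= zeta, event = k)
                 for k = 1..K.
    duration:    observed times, shape (n_samples,).
    event:       observed event labels, 0 meaning censored (assumed convention).
    """
    cif_at_zeta = np.asarray(cif_at_zeta, dtype=float)
    duration = np.asarray(duration, dtype=float)
    event = np.asarray(event)

    # Column 0 holds the predicted probability of having no event by zeta.
    survival = 1.0 - cif_at_zeta.sum(axis=1, keepdims=True)
    probs = np.hstack([survival, cif_at_zeta])
    predicted = probs.argmax(axis=1)

    # Observed state at zeta: the event type if it occurred by zeta, else 0.
    observed = np.where(duration <= zeta, event, 0)

    # Assumption: samples censored before zeta have an unknown state, so drop them.
    known = ~((event == 0) & (duration <= zeta))
    return float(np.mean(predicted[known] == observed[known]))


cif = [[0.7, 0.1], [0.5, 0.2], [0.2, 0.6], [0.3, 0.3]]
print(accuracy_at_time(cif, duration=[1, 5, 2, 1], event=[1, 0, 2, 0], zeta=3))
```

Evaluating this at several time quantiles gives one accuracy value per quantile, which is the kind of curve the caption describes.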

Theorems & Definitions (28)

  • definition 1: Quantities of interest
  • definition 3: Proper Scoring Rule
  • definition 4: PSR for competing risks settings
  • definition 5: Competitive Weights Negative LogLoss
  • lemma 5
  • proof : Proof sketch
  • theorem 6: Properness of the scoring rule
  • proof : Proof sketch
  • definition 7: Prediction accuracy at time $\zeta$
  • proof : Proof of Lemma \ref{lem:usefulllemma} on the expectation of the Reweighted NLL
  • ...and 18 more