SurvBETA: Ensemble-Based Survival Models Using Beran Estimators and Several Attention Mechanisms

Lev V. Utkin, Semen P. Khomets, Vlada A. Efremenko, Andrei V. Konstantinov

TL;DR

SurvBETA tackles survival analysis with censored data by introducing an ensemble that uses the Beran estimator as a weak learner and aggregates conditional survival functions $S(t|\mathbf{x})$ through a three-level attention framework. The three levels are attention inside each Beran estimator, Nadaraya–Watson-based selection of a prototype for each bootstrap sample, and a global attention that aggregates the weak learners' predictions. A simplified special case based on Huber's imprecise ε-contamination model eases optimization. The model demonstrates competitive performance against RSF, GBM-Cox, and GBM-AFT on synthetic and real datasets, and its code is public for reproducibility. The work offers a flexible, end-to-end trainable architecture that can accommodate different weak learners and kernel choices, supporting multimodal survival modeling and potential neural-kernel extensions.

Abstract

Many ensemble-based models have been proposed to solve machine learning problems in the survival analysis framework, including random survival forests, gradient boosting machines with weak survival models, and ensembles of Cox models. To extend this set of models, a new ensemble-based model called SurvBETA (the Survival Beran estimator Ensemble using Three Attention mechanisms) is proposed, in which the Beran estimator is used as a weak learner. The Beran estimator can be regarded as a kernel regression model that takes into account the relationship between instances. The outputs of the weak learners, in the form of conditional survival functions, are aggregated with attention weights that take into account the distance between the analyzed instance and the prototypes of all bootstrap samples. The attention mechanism is used three times: for implementing the Beran estimators, for determining specific prototypes of the bootstrap samples, and for aggregating the weak model predictions. The proposed model is presented in two forms: a general form, whose training requires solving a complex optimization problem, and a simplified form, which represents the attention weights by means of the imprecise Huber's contamination model and leads to a simple optimization problem. Numerical experiments illustrate properties of the model on synthetic data and compare the model with other survival models on real data. Code implementing the proposed model is publicly available.
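
To make the abstract's description more concrete, the following is a minimal sketch of a Beran estimator and of an attention-weighted ensemble built from it. It is not the authors' implementation. The Gaussian kernel, the single shared temperature `tau`, the function names `beran_survival` and `ensemble_survival`, and the absence of any trainable parameters are all assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def beran_survival(x, X, times, events, tau=1.0):
    """Beran estimate of S(t|x) as a step function.

    Returns (sorted_times, S), where S[i] is the estimate just after
    sorted_times[i]. events[i] is 1 for an observed event, 0 for censoring.
    """
    order = np.argsort(times, kind="stable")
    X, times, events = X[order], times[order], events[order]
    # Assumed Gaussian kernel: normalised weights are a softmax over
    # negative squared distances, i.e. attention weights of x over X.
    w = softmax(-np.sum((X - x) ** 2, axis=1) / tau)
    # Total weight of instances that come strictly earlier in time.
    prev = np.concatenate(([0.0], np.cumsum(w)[:-1]))
    denom = np.clip(1.0 - prev, 1e-12, None)
    # Censored instances contribute a factor of 1.
    factors = np.where(events == 1, 1.0 - w / denom, 1.0)
    return times, np.cumprod(np.clip(factors, 0.0, 1.0))


def ensemble_survival(x, X, times, events, n_models=5, tau=1.0, seed=0):
    """Aggregate Beran estimators fitted on bootstrap samples.

    Each bootstrap sample gets a prototype (attention-weighted mean of its
    instances with respect to x); the weak learners' curves are then mixed
    with softmax weights based on the distance from x to each prototype.
    """
    rng = np.random.default_rng(seed)
    grid = np.unique(times)  # common, sorted evaluation grid
    curves, protos = [], []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=len(X), replace=True)
        t_k, s_k = beran_survival(x, X[idx], times[idx], events[idx], tau)
        # Evaluate the step function on the common grid (S = 1 before
        # the first time point of the bootstrap sample).
        pos = np.searchsorted(t_k, grid, side="right") - 1
        curves.append(np.where(pos >= 0, s_k[np.maximum(pos, 0)], 1.0))
        a = softmax(-np.sum((X[idx] - x) ** 2, axis=1) / tau)
        protos.append(a @ X[idx])
    protos = np.array(protos)
    alpha = softmax(-np.sum((protos - x) ** 2, axis=1) / tau)
    return grid, alpha @ np.array(curves)
```

When all training covariates coincide, the kernel weights become uniform and `beran_survival` reduces to the Kaplan–Meier estimator, which is a convenient sanity check. In the model described above the attention parameters are trained rather than fixed, so this sketch only illustrates the data flow.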

Paper Structure

This paper contains 18 sections, 50 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: A structure of attention mechanisms with the corresponding attention weights
  • Figure 2: Dependence of the C-index on the number of the Beran estimators
  • Figure 3: Dependence of the C-index on the number of points in each cluster
  • Figure 4: Dependence of the C-index on the distance between clusters
  • Figure 5: Dependence of the C-index on the parameter $k$
  • ...and 1 more figure