GRAFT: Decoupling Ranking and Calibration for Survival Analysis

Mohammad Ashhad; Robert Hoehndorf; Ricardo Henao

TL;DR

GRAFT (Gated Residual Accelerated Failure Time) is a novel AFT model that decouples prognostic ranking from calibration. It combines a linear AFT model with a non-linear residual neural network and integrates stochastic gates for automatic, end-to-end feature selection.

Abstract

Survival analysis is complicated by censored data, high-dimensional features, and non-linear interactions. Classical models are interpretable but restrictive, while deep learning models are flexible but often hard to interpret and sensitive to noise. We propose GRAFT (Gated Residual Accelerated Failure Time), a novel AFT model that decouples prognostic ranking from calibration. GRAFT's hybrid architecture combines a linear AFT model with a non-linear residual neural network and integrates stochastic gates for automatic, end-to-end feature selection. The model is trained by directly optimizing a differentiable, C-index-aligned ranking loss, using stochastic conditional imputation from local Kaplan-Meier estimators to handle censored observations. On public benchmarks, GRAFT outperforms baselines in both discrimination and calibration while remaining robust and sparse in high-noise settings.
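To make the idea of a "differentiable, C-index-aligned ranking loss" concrete, the following is a minimal sketch, not the paper's actual objective. It assumes predictions are survival times (larger means longer survival, as in an AFT model) and uses a standard sigmoid relaxation of the pairwise concordance indicator. The function names and the temperature parameter `sigma` are illustrative assumptions, and censoring is handled here only by restricting to comparable pairs rather than by the Kaplan-Meier imputation the abstract describes.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _comparable_pairs(times, events):
    """Yield (i, j) where subject i had an observed event before time j."""
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                yield i, j


def harrell_c_index(times, events, preds):
    """Harrell's C-index for predicted survival times; ties count as 0.5."""
    concordant, comparable = 0.0, 0
    for i, j in _comparable_pairs(times, events):
        comparable += 1
        if preds[i] < preds[j]:
            concordant += 1.0
        elif preds[i] == preds[j]:
            concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def smooth_ranking_loss(times, events, preds, sigma=1.0):
    """One minus a sigmoid-smoothed C-index (illustrative surrogate only).

    The hard indicator [pred_i < pred_j] is replaced by
    sigmoid((pred_j - pred_i) / sigma), which is differentiable in preds.
    """
    total, comparable = 0.0, 0
    for i, j in _comparable_pairs(times, events):
        comparable += 1
        total += _sigmoid((preds[j] - preds[i]) / sigma)
    return 1.0 - total / comparable if comparable else float("nan")


# Toy data: the third subject is censored (event flag 0).
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
preds = [1.0, 2.0, 3.0, 4.0]  # perfectly ordered predictions
c_index = harrell_c_index(times, events, preds)
loss = smooth_ranking_loss(times, events, preds, sigma=0.01)
```

As `sigma` shrinks, the surrogate approaches one minus the hard C-index, so minimizing it pushes the model toward correct pairwise ordering. Because the loss depends only on differences between predictions, it constrains ranking but not the absolute time scale, which is one way to see why ranking and calibration can be treated separately.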

Paper Structure

This paper contains 23 sections, 13 equations, 4 figures, and 4 tables.

Introduction
Related Work
The Gated Residual AFT Model
Problem Formulation
GRAFT Model Overview
Stochastic Conditional Imputation via Local KM
Feature Selection with Stochastic Gates
Training Objective: Differentiable Ranking Loss
Survival Function Estimation
Experiments
Experimental Setup
Ablation Study
Noise Robustness Analysis
Results and Discussion
Baseline Comparison
...and 8 more sections

Figures (4)

Figure 1: Ablation study showing C-index degradation under increasing Gaussian noise for three GRAFT variants: the complete model (Full GRAFT), the model without stochastic gates (No STG), and the model without the residual network (Linear Only). Error bars represent standard deviation across 3 random seeds.
Figure 2: Comparison of C-index for all six models under heavy-tailed Student's $t$ noise ($df=2$). Error bars represent standard deviation across 3 random seeds.
Figure 3: Ablation study showing IBS degradation under increasing Gaussian noise for three GRAFT variants across six datasets. "Full GRAFT" (red) maintains stable performance across noise levels. Error bars represent standard deviation across 3 random seeds.
Figure 4: Noise robustness comparison showing IBS under increasing Student's $t$ noise ($df=2$) for all six models across six datasets. GRAFT (red) maintains the flattest performance curves, demonstrating superior robustness to heavy-tailed noise. Error bars represent standard deviation across 3 random seeds.
