FPBoost: Fully Parametric Gradient Boosting for Survival Analysis
Alberto Archetti, Eugenio Lomurno, Diego Piccinotti, Matteo Matteucci
TL;DR
FPBoost addresses limitations of traditional survival models by modeling the instantaneous risk (the hazard) as a weighted sum of fully parametric hazard functions and by optimizing the full survival likelihood rather than relying on partial likelihood or time discretization. Gradient-boosted trees estimate the parameters $(\eta_j, k_j)$ and the weight $w_j$ of each head $j$, yielding flexible yet interpretable hazard shapes built from Weibull and LogLogistic components; nonnegativity is enforced via $\text{ReLU}$, with final hazard clipping as needed. The paper proves a universal hazard approximation property for mixtures of Weibull heads and demonstrates strong empirical performance across diverse right-censored datasets, with concordance and calibration competitive with both tree-based and neural-network survival models. An open-source implementation compatible with scikit-survival accompanies the paper.
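To make the weighted-sum construction concrete, here is a minimal NumPy sketch of a hazard built from Weibull and LogLogistic heads. It is an illustration under stated assumptions, not the authors' implementation: the parametrizations $S(t) = \exp(-\eta t^k)$ and $S(t) = 1/(1 + \eta t^k)$, the function names, and the choice to apply the ReLU to the weights are all assumptions made here. In the actual model the parameters would be per-sample outputs of the boosted trees; here they are fixed scalars.

```python
import numpy as np

def weibull_hazard(t, eta, k):
    # Weibull hazard, assuming the parametrization S(t) = exp(-eta * t**k).
    return eta * k * t ** (k - 1.0)

def loglogistic_hazard(t, eta, k):
    # LogLogistic hazard, assuming the parametrization S(t) = 1 / (1 + eta * t**k).
    return eta * k * t ** (k - 1.0) / (1.0 + eta * t ** k)

# Hypothetical registry mapping a head type to its hazard function.
HEADS = {"weibull": weibull_hazard, "loglogistic": loglogistic_hazard}

def mixture_hazard(t, kinds, etas, ks, weights):
    """Weighted sum of parametric hazards evaluated at times t (t > 0)."""
    t = np.asarray(t, dtype=float)
    # ReLU on the weights (an assumption about where nonnegativity is enforced).
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    total = np.zeros_like(t)
    for kind, eta, k, w_j in zip(kinds, etas, ks, w):
        total += w_j * HEADS[kind](t, eta, k)
    # Final clipping keeps the hazard nonnegative.
    return np.clip(total, 0.0, None)

t = np.array([0.5, 1.0, 2.0])
h = mixture_hazard(
    t,
    kinds=["weibull", "loglogistic"],
    etas=[0.5, 1.0],
    ks=[1.0, 2.0],
    weights=[1.0, -3.0],  # the negative weight is zeroed by the ReLU
)
```

In this toy call only the Weibull head with $k = 1$ survives the ReLU, so the resulting hazard is constant in time, which is the exponential special case.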
Abstract
Survival analysis is a statistical framework for modeling time-to-event data. It plays a pivotal role in medicine, reliability engineering, and social science research, where understanding event dynamics even with few data samples is critical. Recent advancements in machine learning, particularly those employing neural networks and decision trees, have introduced sophisticated algorithms for survival modeling. However, many of these methods rely on restrictive assumptions about the underlying event-time distribution, such as proportional hazards, time discretization, or accelerated failure time. In this study, we propose FPBoost, a survival model that combines a weighted sum of fully parametric hazard functions with gradient boosting. Distribution parameters are estimated with decision trees trained by maximizing the full survival likelihood. We show that FPBoost is a universal approximator of hazard functions, offering full event-time modeling flexibility while maintaining interpretability through the use of well-established parametric distributions. We evaluate the concordance and calibration of FPBoost across multiple benchmark datasets, showcasing its robustness and versatility as a new tool for survival estimation.
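As a rough illustration of what maximizing the full survival likelihood involves, the sketch below evaluates the standard right-censored negative log-likelihood, $-\sum_i \left[\delta_i \log h(t_i) - H(t_i)\right]$, where $\delta_i$ is the event indicator, $h$ the hazard, and $H$ the cumulative hazard. The function name, the `eps` guard, and the toy data are assumptions made for this example; the paper's training objective may differ in details such as averaging or regularization.

```python
import numpy as np

def full_negative_log_likelihood(hazard, cum_hazard, event, eps=1e-12):
    """Right-censored NLL: -sum_i [ delta_i * log h(t_i) - H(t_i) ]."""
    hazard = np.asarray(hazard, dtype=float)
    cum_hazard = np.asarray(cum_hazard, dtype=float)
    event = np.asarray(event, dtype=float)
    # Guard against log(0) when a hazard has been clipped to zero.
    log_h = np.log(np.clip(hazard, eps, None))
    # Observed events contribute log h(t) - H(t); censored samples only -H(t).
    return -np.sum(event * log_h - cum_hazard)

# Toy check with a constant hazard lam (exponential model):
# h(t) = lam and H(t) = lam * t.
times = np.array([1.0, 2.0, 3.0])
event = np.array([1, 0, 1])  # the second sample is right-censored
lam = 0.5
nll = full_negative_log_likelihood(np.full_like(times, lam), lam * times, event)
```

Unlike the Cox partial likelihood, this objective depends on the absolute event times through $h$ and $H$, not only on their ordering, and it needs no discretization of the time axis.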
