Optimal Survival Trees: A Dynamic Programming Approach
Tim Huisman, Jacobus G. M. van der Linden, Emir Demirović
TL;DR
The paper tackles the problem of learning globally optimal survival trees under right-censoring. It introduces SurTree, a dynamic-programming framework that finds optimal trees for a given size limit and uses a specialized algorithm for trees of depth at most two to improve scalability. Empirical results show that SurTree achieves out-of-sample performance competitive with OST and better than CTree, with substantial runtime advantages in many settings. The approach enables direct assessment of the optimality gap of heuristic trees and lays the groundwork for further enhancements, such as incorporating Cox models within leaves. Overall, SurTree advances interpretable survival analysis by combining global optimality guarantees with scalable optimization.
Abstract
Survival analysis studies and predicts the time of death, or of other singular, unrepeated events, based on historical data in which the true time of death is unknown for some instances. Survival trees enable the discovery of complex nonlinear relations in a compact, human-comprehensible model by recursively splitting the population and predicting a distinct survival distribution in each leaf node. We use dynamic programming to provide the first survival tree method with optimality guarantees, enabling the assessment of the optimality gap of heuristics. We improve the scalability of our method through a special algorithm for computing trees up to depth two. The experiments show that our method's run time even outperforms some heuristics for realistic cases while obtaining out-of-sample performance similar to the state of the art.
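To make the notion of "a distinct survival distribution in each leaf" under right-censoring concrete, the following is a minimal sketch and not the SurTree implementation. It assumes the Kaplan-Meier estimator as the per-leaf model, which is a standard choice but not necessarily the one used in the paper, and it splits on a single binary feature. The function names `kaplan_meier` and `split_and_estimate` and the row layout (`x`, `time`, `event`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- observed time per instance (event time or censoring time)
    events -- 1 if the event was observed, 0 if the instance is right-censored
    (Assumption: Kaplan-Meier is used here purely as an illustrative leaf model.)
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # every instance leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if deaths[t]:
            # Censored instances at time t still count as at risk at t.
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def split_and_estimate(rows, feature):
    """Split rows on one binary feature and fit one survival curve per leaf.

    rows -- list of dicts with keys "x" (list of 0/1), "time", "event"
            (hypothetical layout, not taken from the paper)
    """
    leaves = {}
    for value in (0, 1):
        subset = [r for r in rows if r["x"][feature] == value]
        leaves[value] = kaplan_meier(
            [r["time"] for r in subset],
            [r["event"] for r in subset],
        )
    return leaves
```

A censored instance contributes to the risk set up to its censoring time but never triggers a drop in the curve, which is how partial information about unknown event times is retained. An optimal method would search over all feature choices and tree shapes rather than fixing one split as this sketch does.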
