Case-Base Neural Networks: survival analysis with time-varying, higher-order interactions
Jesse Islam, Maxime Turgeon, Robert Sladek, Sahir Bhatnagar
TL;DR
CBNNs address non-proportional hazards in single-event survival analysis by including time as an input to the network within a case-base sampling framework, which enables data-driven estimation of time-varying interactions and a flexible baseline hazard. After case-base sampling, a standard feed-forward neural network predicts the odds of an event, with an offset that corrects for the sampling bias; the result is a full hazard function that can be transformed into survival probabilities. In a simulation with a complex baseline hazard and time-varying interactions, CBNNs outperform several regression and neural-network baselines; across three real-world case studies, they outperform the competing models in two and perform similarly in the third. The work provides practical software, including R and Python implementations, and shows how case-base sampling can extend neural networks to censored survival data without specialized loss functions. Overall, CBNNs offer a user-friendly, scalable framework for learning time-varying effects and complex baseline hazards in single-event survival analyses with censoring.
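The role of the offset and the step from hazard to survival can be written out as follows. This is a sketch that assumes the standard case-base logistic formulation rather than a statement of the authors' exact notation. The symbols are introduced here for illustration: Y indicates whether a sampled person-moment is an event, B is the total follow-up time in the study, and b is the number of sampled base-series moments.

```latex
\begin{align*}
\log \frac{P(Y = 1 \mid x, t)}{P(Y = 0 \mid x, t)} &= \log h(t \mid x) + \log \frac{B}{b}, \\
h(t \mid x) &= \frac{b}{B} \cdot \frac{P(Y = 1 \mid x, t)}{1 - P(Y = 1 \mid x, t)}, \\
S(t \mid x) &= \exp\!\left( -\int_0^t h(u \mid x)\, du \right).
\end{align*}
```

Under this formulation, the offset log(B/b) is known from the sampling design, so the network only has to learn log h(t | x).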
Abstract
In the context of survival analysis, data-driven neural network-based methods have been developed to model complex covariate effects. While these methods may provide better predictive performance than regression-based approaches, not all can model time-varying interactions and complex baseline hazards. To address this, we propose Case-Base Neural Networks (CBNNs) as a new approach that combines the case-base sampling framework with flexible neural network architectures. Using a novel sampling scheme and data augmentation to naturally account for censoring, we construct a feed-forward neural network that includes time as an input. CBNNs predict the probability of an event occurring at a given moment to estimate the full hazard function. We compare the performance of CBNNs to regression and neural network-based survival methods in a simulation and three case studies using two time-dependent metrics. First, we assess all methods on a simulation involving a complex baseline hazard and time-varying interactions, where CBNNs outperform the competitors. Then, we apply all methods to three real data applications, with CBNNs outperforming the competing models in two studies and showing similar performance in the third. Our results highlight the benefit of combining case-base sampling with deep learning to provide a simple and flexible framework for data-driven modeling of single-event survival outcomes that estimates time-varying effects and a complex baseline hazard by design. An R package is available at https://github.com/Jesse-Islam/cbnn.
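To make the sampling scheme concrete, here is a minimal Python sketch of case-base sampling and of turning a predicted hazard curve into survival probabilities. It is not the cbnn package's API. The function names `case_base_sample` and `survival_from_hazard`, the `ratio` parameter, and the default of 10 base moments per event are hypothetical choices made for illustration. The sketch assumes the base series is drawn uniformly over all observed person-time and that the offset is log(B/b), as in the standard case-base formulation.

```python
import numpy as np


def case_base_sample(times, events, ratio=10, seed=None):
    """Build the case and base series from (follow-up time, event flag) pairs.

    Hypothetical helper for illustration. Returns the index of the person each
    row comes from, the sampled moment, the label (1 = case, 0 = base), and the
    offset log(B / b).
    """
    rng = np.random.default_rng(seed)
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    total_time = times.sum()            # B: total person-time in the study
    case_idx = np.flatnonzero(events == 1)
    n_cases = len(case_idx)
    n_base = ratio * n_cases            # b: size of the base series

    # Base series: pick a person with probability proportional to follow-up
    # time, then a uniform moment within that person's follow-up. Censored
    # individuals contribute person-time here, which is how censoring enters.
    base_idx = rng.choice(len(times), size=n_base, p=times / total_time)
    base_moments = rng.uniform(0.0, times[base_idx])

    # Case series: every observed event, taken at its event time.
    person = np.concatenate([case_idx, base_idx])
    moment = np.concatenate([times[case_idx], base_moments])
    label = np.concatenate([np.ones(n_cases), np.zeros(n_base)])
    offset = np.log(total_time / n_base)
    return person, moment, label, offset


def survival_from_hazard(grid, hazard):
    """Convert hazard values on an increasing time grid to survival probabilities.

    Uses the trapezoidal rule for the cumulative hazard, then S(t) = exp(-H(t)).
    """
    grid = np.asarray(grid, dtype=float)
    hazard = np.asarray(hazard, dtype=float)
    increments = 0.5 * (hazard[1:] + hazard[:-1]) * np.diff(grid)
    cumulative_hazard = np.concatenate([[0.0], np.cumsum(increments)])
    return np.exp(-cumulative_hazard)
```

In use, each sampled row would be joined with that person's covariates and the sampled moment to form the network input, with `label` as the binary target. Under the assumed formulation, a fitted model's log-odds minus `offset` gives the log-hazard at any (covariates, time) pair, and evaluating it over a time grid and passing the result to `survival_from_hazard` yields a survival curve.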
