Survival Concept-Based Learning Models
Stanislav R. Kirpichenko, Lev V. Utkin, Andrei V. Konstantinov, Natalya M. Verbova
TL;DR
This work addresses survival analysis with censored data by introducing two concept-based survival models, SurvCBM and SurvRCM, that integrate a concept bottleneck with Cox or Beran survival predictors. SurvCBM jointly learns concept predictions and survival functions end-to-end, enabling interpretable predictions in terms of human-understandable concepts. The paper develops two interpretability pathways, linear concept contributions under the Cox model and instance-based explanations via the Beran estimator. It also demonstrates that SurvCBM consistently outperforms SurvRCM and traditional survival models on synthetic MNIST/CIFAR-10 datasets. The findings establish the value of incorporating concept information into survival analysis for improved accuracy and interpretability, with public code provided for replication and extension.
Abstract
Concept-based learning (CBL) enhances prediction accuracy and interpretability by leveraging high-level, human-understandable concepts. However, existing CBL frameworks do not address survival analysis tasks, which involve predicting event times in the presence of censored data -- a common scenario in fields like medicine and reliability analysis. To bridge this gap, we propose two novel models: SurvCBM (Survival Concept-based Bottleneck Model) and SurvRCM (Survival Regularized Concept-based Model), which integrate concept-based learning with survival analysis to handle censored event time data. The models employ the Cox proportional hazards model and the Beran estimator. SurvCBM is based on the architecture of the well-known concept bottleneck model, offering interpretable predictions through concept-based explanations. SurvRCM uses concepts as regularization to enhance accuracy. Both models are trained end-to-end and provide interpretable predictions in terms of concepts. Two interpretability approaches are proposed: one leveraging the linear relationship in the Cox model and another using an instance-based explanation framework with the Beran estimator. Numerical experiments demonstrate that SurvCBM outperforms SurvRCM and traditional survival models, underscoring the importance and advantages of incorporating concept information. The code for the proposed algorithms is publicly available.
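To make the Beran estimator mentioned above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian kernel over concept vectors, and the function name `beran_survival`, its arguments, and the `bandwidth` parameter are hypothetical. The estimator weights each training instance by its similarity to the query and builds a weighted product-limit survival curve, in which censored instances contribute no drop. The normalized weights are returned as well, since they indicate which training instances drive a given prediction, which is the general idea behind an instance-based explanation.

```python
import numpy as np

def beran_survival(query, concepts, times, events, bandwidth=1.0):
    """Sketch of the Beran estimator with a Gaussian kernel (an assumption).

    query    : (d,) concept vector of the instance to explain
    concepts : (n, d) concept vectors of training instances
    times    : (n,) observed times (event or censoring)
    events   : (n,) 1 if the event was observed, 0 if censored
    Returns sorted times, survival values at those times, and the
    normalized kernel weights (in the same sorted order).
    """
    # Sort training instances by observed time.
    order = np.argsort(times)
    concepts, times, events = concepts[order], times[order], events[order]

    # Normalized kernel weights: similarity of each instance to the query.
    sq_dist = np.sum((concepts - query) ** 2, axis=1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    weights /= weights.sum()

    # Weight mass of instances with strictly earlier times.
    prior_mass = np.concatenate(([0.0], np.cumsum(weights)[:-1]))
    at_risk = np.clip(1.0 - prior_mass, 1e-12, None)

    # Only uncensored instances reduce the survival function.
    factors = np.where(events == 1, 1.0 - weights / at_risk, 1.0)
    factors = np.clip(factors, 0.0, 1.0)
    return times, np.cumprod(factors), weights
```

When all weights are equal, this sketch reduces to the Kaplan-Meier estimator, which is a convenient sanity check. In a concept-based model the `concepts` array would hold predicted concept vectors rather than raw inputs, but that wiring is outside the scope of this sketch.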
