Extending Cox Proportional Hazards Model with Symbolic Non-Linear Log-Risk Functions for Survival Analysis
Jiaxiang Cheng, Guoqiang Hu
TL;DR
This work addresses a limitation of the Cox proportional hazards model, its linear log-risk function, by introducing Generalized Cox Proportional Hazards (GCPH), which uses Kolmogorov-Arnold Networks to learn a fully symbolic non-linear log-risk function $f(\mathbf{x}; \bm{\Phi}) = \sum_{v=1}^V \hat{\phi}_v(x_v)$. The method combines the log-partial likelihood with sparsity- and entropy-based regularization, and applies a post-training symbolification step to yield a human-interpretable symbolic activation for each covariate. Experiments on synthetic and real-world survival datasets show that GCPH achieves competitive C-index and Brier scores while providing transparent, covariate-specific risk mappings, improving interpretability without sacrificing performance. The approach is aimed at domains that require both accurate survival predictions and understandable risk explanations, and it narrows the gap between deep non-linear models and traditional, interpretable survival analysis.
Abstract
The Cox proportional hazards (CPH) model has been widely applied in survival analysis to estimate relative risks across subjects given multiple covariates. Traditional CPH models use a linear combination of covariates, weighted by coefficients, as the log-risk function, which is a strong and restrictive assumption that limits generalization. Recent deep learning methods enable non-linear log-risk functions, but they often lack interpretability because of their end-to-end training. Kolmogorov-Arnold Networks (KAN) offer new possibilities for extending the CPH model with fully transparent and symbolic non-linear log-risk functions. In this paper, we introduce the Generalized Cox Proportional Hazards (GCPH) model, a novel method for survival analysis that leverages KAN to enable a non-linear mapping from covariates to survival outcomes in a fully symbolic manner. GCPH maintains the interpretability of traditional CPH models while allowing for the estimation of non-linear log-risk functions. Experiments conducted on both synthetic data and various public benchmarks demonstrate that GCPH achieves competitive performance in terms of prediction accuracy and exhibits superior interpretability compared to current state-of-the-art methods.
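
To make the objective concrete, the following is a minimal NumPy sketch, not the authors' implementation, of an additive log-risk with one univariate function per covariate, scored with the negative Cox log-partial likelihood. The function names, the example symbolic forms, and the toy data are illustrative assumptions. The sparsity and entropy regularization and the KAN training and symbolification steps are omitted, and tied event times receive no special handling.

```python
import numpy as np

def additive_log_risk(X, phis):
    """Additive log-risk: f(x) = sum over covariates v of phi_v(x_v)."""
    return sum(phi(X[:, v]) for v, phi in enumerate(phis))

def neg_log_partial_likelihood(log_risk, time, event):
    """Negative Cox log-partial likelihood (assumes distinct event times)."""
    order = np.argsort(-time)          # sort by descending time
    lr = log_risk[order]
    ev = event[order].astype(bool)
    # After sorting, the risk set of subject i is every subject up to i,
    # so a running log-sum-exp gives log(sum_{j: t_j >= t_i} exp(f(x_j))).
    log_risk_set = np.logaddexp.accumulate(lr)
    return -np.sum(lr[ev] - log_risk_set[ev])

# Hypothetical symbolic forms for two covariates (illustrative only).
phis = [lambda x: x ** 2, np.sin]

X = np.array([[1.0, 0.0], [2.0, 0.0], [0.5, 1.0]])
time = np.array([5.0, 2.0, 3.0])
event = np.array([1, 1, 0])            # 1 = event observed, 0 = censored

f = additive_log_risk(X, phis)
loss = neg_log_partial_likelihood(f, time, event)
```

Because each covariate contributes through its own univariate function, every term of the fitted log-risk can be inspected on its own, which is the property the abstract refers to as covariate-specific interpretability.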
