Deep Clustering Survival Machines with Interpretable Expert Distributions

Bojian Hou; Hongming Li; Zhicheng Jiao; Zhen Zhou; Hao Zheng; Yong Fan

TL;DR

The paper tackles heterogeneity in time-to-event data by introducing Deep Clustering Survival Machines (DCSM), which express the conditional survival distribution as a weighted mixture of K Weibull experts whose parameters are constant across instances. An encoder φ_θ maps features X to a latent representation, from which instance-specific weights α_k = softmax((w^T φ_θ(X))_k) are computed, so that P(T|X) = Σ_k α_k P(T|μ_k, σ_k). Risk is evaluated at a horizon t_max as r_i = 1 - Σ_k α_k CDF(t_max|μ_k, σ_k), where CDF(t) is given as exp(-(t/σ_k)^{μ_k}); note that for a Weibull distribution with shape μ_k and scale σ_k this expression is the survival function, and the cumulative distribution function is its complement. The training objective combines a prior on μ_k, σ_k with ELBO terms for uncensored and censored data, L_all = L_prior - ELBO_U(Θ) - λ ELBO_C(Θ), which enables simultaneous time-to-event prediction and clustering by the dominant expert. Empirical results on four real datasets and 36 synthetic datasets show competitive predictive performance (C-index) and superior clustering quality (LogRank), with the learned expert distributions mirroring Kaplan–Meier curves and enhancing interpretability for personalized prognosis.
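The mixture described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the encoder output `latent`, the weight matrix `w`, and all numeric values are hypothetical placeholders, and the expression exp(-(t/σ_k)^{μ_k}) is treated here as the per-expert Weibull survival function with shape μ_k and scale σ_k.

```python
import numpy as np

def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def weibull_survival(t, shape, scale):
    # Weibull survival function S(t) = exp(-(t / scale) ** shape).
    return np.exp(-(t / scale) ** shape)

def mixture_survival(latent, w, shape, scale, t):
    # latent: (n, d) stand-in for the encoder output phi_theta(X)
    # w: (d, K) projection to expert logits; shape, scale: (K,) expert parameters
    alpha = softmax(latent @ w)               # (n, K) instance-specific weights
    s_k = weibull_survival(t, shape, scale)   # (K,) survival of each expert at t
    return alpha, alpha @ s_k                 # weights and mixture survival, (n,)

# Hypothetical inputs: 3 instances, 2 latent dimensions, K = 2 experts.
latent = np.array([[0.2, -1.0], [1.5, 0.3], [-0.7, 0.9]])
w = np.array([[1.0, -1.0], [0.5, 2.0]])
shape = np.array([1.5, 0.8])    # mu_k, shared by all instances
scale = np.array([2.0, 10.0])   # sigma_k, shared by all instances
t_max = 5.0

alpha, surv = mixture_survival(latent, w, shape, scale, t_max)
event_prob = 1.0 - surv  # mixture probability of an event by t_max
```

Because the expert parameters are shared, only the weights `alpha` differ between instances, which is what allows the weights to double as soft cluster memberships.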

Abstract

Conventional survival analysis methods are typically ineffective at characterizing heterogeneity in the population, even though such information can be used to assist predictive modeling. In this study, we propose a hybrid survival analysis method, referred to as deep clustering survival machines, that combines discriminative and generative mechanisms. Similar to mixture models, we assume that the timing information of survival data is generatively described by a mixture of a certain number of parametric distributions, i.e., expert distributions. We discriminatively learn weights of the expert distributions for individual instances according to their features, such that each instance's survival information can be characterized by a weighted combination of the learned constant expert distributions. This method also facilitates interpretable subgrouping/clustering of all instances according to their associated expert distributions. Extensive experiments on both real and synthetic datasets demonstrate that the method is capable of obtaining promising clustering results and competitive time-to-event prediction performance.
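As a rough illustration of the subgrouping step, the sketch below assigns each instance to the expert distribution with the largest learned weight. The weight matrix `alpha` is made up for the example, and the function name `assign_clusters` is hypothetical rather than taken from the paper.

```python
import numpy as np

def assign_clusters(alpha):
    # alpha: (n, K) non-negative weights, one row per instance.
    # Each instance is assigned to the expert with the largest weight.
    alpha = np.asarray(alpha, dtype=float)
    return alpha.argmax(axis=1)

# Made-up weights for 3 instances and K = 2 experts.
alpha = np.array([[0.90, 0.10],
                  [0.30, 0.70],
                  [0.55, 0.45]])

clusters = assign_clusters(alpha)                        # cluster index per instance
sizes = np.bincount(clusters, minlength=alpha.shape[1])  # instances per cluster
```

A hard argmax discards how confident each assignment is; the third row above is close to a tie, which the raw weights still reveal.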

Paper Structure (10 sections, 8 equations, 2 figures, 2 tables, 1 algorithm)

Introduction
Methods
Experiments
Datasets
Baseline Methods, Metrics and Settings
Quantitative Results on Real Data
Quantitative Results on Synthetic Data
Qualitative Results on Real Data
Conclusion
Compliance with ethical standards

Figures (2)

Figure 1: The model structure of the proposed DCSM. Part 1 learns each instance's survival function by a weighted combination of the expert distributions. Part 2 clusters instances by the learned weights allocated to each expert distribution.
Figure 2: (a) C-index comparison across the 36 synthetic datasets, shown as a radar plot in which a larger area means better performance. The area of the proposed method is filled; on most synthetic datasets (30 of 36), the baseline methods' curves fall inside it. (b-g) Kaplan-Meier plots of all methods on the PBC dataset; cross marks indicate censoring. (h) The learned expert distributions. The shapes of the two expert distributions resemble the Kaplan-Meier curves of the proposed method, facilitating effective data stratification.
