Deep Kernel Aalen-Johansen Estimator: An Interpretable and Flexible Neural Net Framework for Competing Risks

Xiaobin Shen, George H. Chen

TL;DR

This work introduces the Deep Kernel Aalen-Johansen (DKAJ) estimator, a kernel-based framework for competing risks that represents each data point as a weighted mixture of clusters, where each cluster stores its own Aalen-Johansen (AJ) estimates of the cumulative incidence functions (CIFs). By learning a neural embedding and using a kernel to weight training points, DKAJ yields a conditional AJ-like CIF prediction for new individuals while supporting interpretation through cluster-level visualizations and exemplar contributions. The approach is competitive with state-of-the-art baselines on four standard datasets and provides visualization tools for understanding both group-level and individual-level risk trajectories. The paper also connects DKAJ to classical AJ theory via a likelihood-based interpretation and discusses extensions (kernel choices, clustering schemes, and multistate processes) that preserve interpretability while adding flexibility.

Abstract

We propose an interpretable deep competing risks model called the Deep Kernel Aalen-Johansen (DKAJ) estimator, which generalizes the classical Aalen-Johansen nonparametric estimate of cumulative incidence functions (CIFs). Each data point (e.g., patient) is represented as a weighted combination of clusters. If a data point has nonzero weight only for one cluster, then its predicted CIFs correspond to those of the classical Aalen-Johansen estimator restricted to data points from that cluster. These weights come from an automatically learned kernel function that measures how similar any two data points are. On four standard competing risks datasets, we show that DKAJ is competitive with state-of-the-art baselines while being able to provide visualizations to assist model interpretation.
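
To make the abstract's single-cluster claim concrete, here is a minimal Python sketch of a weighted Aalen-Johansen estimate. It is an illustration under stated assumptions, not the authors' implementation: the function name `weighted_aalen_johansen`, its arguments, and the hand-picked weights are all hypothetical. In DKAJ the weights would come from the learned kernel; here they are simply supplied as an array.

```python
import numpy as np


def weighted_aalen_johansen(times, events, weights, n_events):
    """Weighted Aalen-Johansen estimate of cumulative incidence functions.

    Hypothetical helper for illustration only (not from the paper's code).
    times:   observed times, shape (n,)
    events:  0 = censored, 1..n_events = event type, shape (n,)
    weights: nonnegative weight per training point, shape (n,)
    Returns (grid, cif): grid holds the sorted unique event times and
    cif has shape (n_events, len(grid)).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    weights = np.asarray(weights, dtype=float)

    grid = np.unique(times[events > 0])
    cif = np.zeros((n_events, grid.size))
    running = np.zeros(n_events)  # CIF values accumulated so far
    surv = 1.0                    # event-free survival just before time t

    for j, t in enumerate(grid):
        at_risk = weights[times >= t].sum()
        if at_risk <= 0:
            # No weighted mass left at risk: the curves stay flat.
            cif[:, j:] = running[:, None]
            break
        total_hazard = 0.0
        for k in range(1, n_events + 1):
            d = weights[(times == t) & (events == k)].sum()
            running[k - 1] += surv * d / at_risk
            total_hazard += d / at_risk
        cif[:, j] = running
        surv *= 1.0 - total_hazard
    return grid, cif


# Toy data: four points, two competing events, one censored observation.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 2, 0, 1]

# Uniform weights give the classical Aalen-Johansen estimate on all points.
grid_all, cif_all = weighted_aalen_johansen(times, events, [1, 1, 1, 1], 2)

# Nonzero weight only on the first two points (a stand-in for "one cluster")
# matches the classical estimate restricted to those two points.
_, cif_masked = weighted_aalen_johansen(times, events, [1, 1, 0, 0], 2)
_, cif_subset = weighted_aalen_johansen(times[:2], events[:2], [1, 1], 2)
```

With zero weight outside a subset, the weighted estimate ends at the same CIF values as the unweighted estimator run on that subset alone, which is the behavior the abstract describes for a data point whose weight falls on a single cluster.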

Paper Structure

This paper contains 51 sections, 1 theorem, 49 equations, 11 figures, and 12 tables.

Key Result

Proposition 1

Using the notation above where $0<t_1<\cdots<t_L$ denote the unique times at which any critical event happens, suppose that we parameterize each event-specific hazard function to be piecewise constant on the $L$ intervals $(t_0,t_1], (t_1,t_2], \dots, (t_{L-1},t_L]$, taking the value $\phi_{\delta,\ell}$ on the $\ell$-th interval for event $\delta$, where $\phi_{\delta,\ell}\in[0,\infty)$ for $\delta\in[m], \ell\in[L]$ are the parameters. These parameters do not depend on $x$, so

Figures (11)

  • Figure 1: $C^{\text{td}}$ on the synthetic dataset (Event 1) as training set size varies; 30% held-out test.
  • Figure 2: (Framingham) CIFs for the largest 5 clusters (we then sort these 5 clusters in decreasing order by the estimated probability of CVD death happening within the maximum observed time). Clusters correspond across the two plots in this figure as well as in Figure 3 (e.g., the blue cluster is the same cluster across these visualizations).
  • Figure 3: (Framingham) Feature heatmap summarizing distributions of variables in the same 5 clusters as in Figure 2. Darker shades mean higher feature values or frequencies.
  • Figure G.1: $C^{\text{td}}$ on the synthetic dataset (Event 2) as training set size varies; 30% held-out test.
  • Figure H.1: (Framingham, individual-level) Top 5 clusters with the highest weights assigned to a randomly chosen test patient. We show the AJ curves for the 5 clusters as well as the predicted CIF for this test patient. Clusters correspond across the two plots in this figure as well as in Figure \ref{fig:individual-heatmap-framingham-top-5}.
  • ...and 6 more figures

Theorems & Definitions (1)

  • Proposition 1