HACSurv: A Hierarchical Copula-Based Approach for Survival Analysis with Dependent Competing Risks

Xin Liu, Weijia Zhang, Min-Ling Zhang

TL;DR

This work tackles survival analysis under dependent competing risks and informative censoring by learning Hierarchical Archimedean Copulas (HACs) to flexibly model asymmetric dependencies. HACSurv combines HAC structure learning with MONDE-based marginal survival estimation and derives a CIF-based prediction that accounts for dependence via $F_{k^*}(t^*) = 1 - \frac{C(\{u_i\})}{C(\{u_i\}_{i \neq k^*})}$, where $u_i = S_{T_i|X}(t^*|x^*)$. The authors show that jointly learning the HAC and marginals reduces bias in survival distributions and achieves state-of-the-art performance on synthetic and real-world datasets, while also offering insights into disease interdependencies. The method supports end-to-end training, flexible structure discovery, and even allows specifying HACs when prior knowledge is available; code is provided for reproducibility.
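The dependence-adjusted prediction in the TL;DR can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: it substitutes a single-parameter Clayton copula for $C$ (an assumption; HACSurv learns a hierarchical Archimedean copula), the marginal survival values are made up, and the function names `clayton_copula`, `independence_copula`, and `dependent_prediction` are hypothetical. Given values $u_i$ at $t^*$, it evaluates $1 - C(\{u_i\}) / C(\{u_i\}_{i \neq k^*})$.

```python
import numpy as np


def clayton_copula(u, theta=2.0):
    """Clayton copula C(u) = (sum_i u_i^(-theta) - d + 1)^(-1/theta), theta > 0."""
    u = np.asarray(u, dtype=float)
    return (np.sum(u ** (-theta)) - u.size + 1.0) ** (-1.0 / theta)


def independence_copula(u):
    """Product copula, i.e. no dependence between the event times."""
    return float(np.prod(u))


def dependent_prediction(u, k, copula):
    """Evaluate 1 - C(u) / C(u without entry k), the formula from the TL;DR."""
    u = np.asarray(u, dtype=float)
    u_rest = np.delete(u, k)  # drop the marginal of the event of interest
    return 1.0 - copula(u) / copula(u_rest)


# Hypothetical marginal survival values at t* for censoring, risk 1, risk 2.
u = [0.8, 0.6, 0.9]
p_indep = dependent_prediction(u, k=1, copula=independence_copula)
p_clayton = dependent_prediction(u, k=1, copula=clayton_copula)
print(p_indep, p_clayton)
```

Under the product copula the expression reduces to $1 - u_{k^*}$, so any difference between the two printed values comes from the assumed dependence alone.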

Abstract

In survival analysis, subjects often face competing risks; for example, individuals with cancer may also suffer from heart disease or other illnesses, which can jointly influence the prognosis of risks and censoring. Traditional survival analysis methods often treat competing risks as independent and fail to accommodate the dependencies between different conditions. In this paper, we introduce HACSurv, a survival analysis method that learns Hierarchical Archimedean Copula (HAC) structures and cause-specific survival functions from data with competing risks. HACSurv uses a flexible HAC dependency structure to represent the relationships between competing risks and censoring. By capturing these dependencies, HACSurv improves the accuracy of survival predictions and offers insights into risk interactions. Experiments on a synthetic dataset demonstrate that our method can accurately identify the complex dependency structure and precisely predict survival distributions, whereas the compared methods exhibit significant deviations between their predictions and the true distributions. Experiments on multiple real-world datasets also demonstrate that our method achieves better survival prediction than previous state-of-the-art methods.

Paper Structure

This paper contains 33 sections, 2 theorems, 24 equations, 5 figures, 8 tables.

Key Result

Theorem 1

For a $d$-variate cumulative distribution function $F$ with $j$-th univariate margin $F_j$, the copula associated with $F$ is a cumulative distribution function $C:[0,1]^d \rightarrow [0,1]$ with $U(0,1)$ margins satisfying $F(x_1, \cdots, x_d) = C(F_1(x_1), \cdots, F_d(x_d))$, where $(x_1, \cdots, x_d) \in \mathbb{R}^d$. If $F$ is continuous, then $C$ is unique.
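Sklar's theorem states that a joint distribution factors as $F(x_1, \cdots, x_d) = C(F_1(x_1), \cdots, F_d(x_d))$. The following sketch checks this identity by Monte Carlo under illustrative assumptions that are not taken from the paper: a bivariate Clayton copula with $\theta = 2$, exponential margins with made-up rates, and the standard Marshall-Olkin (gamma frailty) sampler. The helper names `sample_clayton` and `clayton_cdf` are hypothetical.

```python
import numpy as np


def sample_clayton(n, d, theta, rng):
    """Marshall-Olkin sampler: U_i = (1 + E_i / V)^(-1/theta), V ~ Gamma(1/theta, 1)."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=(n, 1))
    e = rng.exponential(scale=1.0, size=(n, d))
    return (1.0 + e / v) ** (-1.0 / theta)


def clayton_cdf(u, theta):
    """Clayton copula C(u) = (sum_i u_i^(-theta) - d + 1)^(-1/theta)."""
    u = np.asarray(u, dtype=float)
    return (np.sum(u ** (-theta)) - u.size + 1.0) ** (-1.0 / theta)


rng = np.random.default_rng(0)
theta = 2.0
rates = np.array([0.5, 1.5])  # hypothetical exponential rates for the two margins

# Build dependent variables X_j = F_j^{-1}(U_j) with exponential margins.
u = sample_clayton(200_000, 2, theta, rng)
x = -np.log1p(-u) / rates

# Left side of the identity: empirical joint CDF at a test point.
point = np.array([1.0, 0.8])
joint_empirical = np.mean(np.all(x <= point, axis=1))

# Right side: copula evaluated at F_j(x_j) = 1 - exp(-rate_j * x_j).
margins = 1.0 - np.exp(-rates * point)
joint_sklar = clayton_cdf(margins, theta)

print(joint_empirical, joint_sklar)
```

The two printed numbers should agree up to Monte Carlo error, which shows how the copula carries all of the dependence while the margins are specified separately.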

Figures (5)

  • Figure 1: Overview of our HACSurv for the two competing risks scenario. We abbreviate $S_{T_k \mid X}$ as $S_k$, for $k = 0, 1, 2$. The copula (out) can be represented by the outer generator $\varphi_0$. The inner generator $\varphi_1$ corresponding to copula (in) is constructed from the Laplace exponent $\psi_1$ and the outer generator $\varphi_0$.
  • Figure 2: The blue samples are generated from the HAC learned on the synthetic dataset. The samples drawn from the ground truth copulas are in black. The results are presented in a mirrored format.
  • Figure 3: Copulas learned by HACSurv from the Framingham and MIMIC-III datasets. (a), (b) and (c) are the copulas between Risk 1 and Risk 2, Risk 1 and censoring, and Risk 2 and censoring on the Framingham dataset. (d), (e) and (f) are the copulas between Risk 1 and Risk 3, Risk 3 and Risk 5, and Risk 2 and Risk 4 on MIMIC-III, respectively.
  • Figure 4: The HAC hierarchy determined using the HACSurv framework: (a) Framingham and SEER datasets, (b) synthetic dataset, and (c) MIMIC-III dataset.
  • Figure 5: Inner copulas learned by HACSurv. (a), (b) and (c) are the copulas learned in the first stage. (d), (e) and (f) are the corresponding inner copulas learned in the second stage using the Re-generation Trick.

Theorems & Definitions (2)

  • Theorem 1: Sklar's theorem
  • Theorem 2: Bernstein-Widder