Rethinking Individual Risk and Aggregation in Survival Analysis: A Latent Mechanism Framework

Xijia Liu

Abstract

Survival analysis provides a well-established framework for modeling time-to-event data, with hazard and survival functions formally defined as population-level quantities. In applied work, however, these quantities are often interpreted as representing individual-level risk, despite the absence of a clear generative account linking individual risk mechanisms to observed survival data. This paper develops a latent hazard framework that makes this relationship explicit by modeling event times as arising from unobserved, individual-specific hazard mechanisms and viewing population-level survival quantities as aggregates over heterogeneous mechanisms. Within this framework, we show that individual hazard trajectories are not identifiable from survival data under partial information. More generally, the conditional distribution of latent hazard mechanisms given covariates is structurally non-identifiable, even when population-level survival functions are fully known. This non-identifiability arises from the aggregation inherent in survival data and persists independently of model flexibility or estimation strategy. Finally, we show that classical survival models can be systematically reinterpreted according to how they handle this unresolved conditional mechanism distribution. This paper provides a unified framework for understanding heterogeneity, identifiability, and interpretation in survival analysis, and clarifies how population-level survival models should be interpreted when individual risk mechanisms are only partially observed, thereby establishing explicit information constraints for principled modeling and inference.
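The non-identifiability claim can be illustrated with a small numerical sketch. It is not taken from the paper: the two rates, the equal mixing weights, and all function names below are hypothetical choices made for illustration. Population A mixes two groups with different constant hazards; in population B every individual shares one time-varying hazard, chosen to equal A's population-level hazard. The two populations have the same survival function, so survival data alone cannot tell them apart.

```python
import math

# Hypothetical two-mechanism population; values are illustrative only.
RATES = (1.0, 3.0)    # constant individual hazards in population A
WEIGHTS = (0.5, 0.5)  # mixing proportions of the two mechanisms


def survival_heterogeneous(t):
    """Population A: survival aggregated over two constant-hazard mechanisms."""
    return sum(w * math.exp(-r * t) for w, r in zip(WEIGHTS, RATES))


def shared_hazard(t):
    """Population B: one hazard shared by everyone, set to A's population hazard."""
    density = sum(w * r * math.exp(-r * t) for w, r in zip(WEIGHTS, RATES))
    return density / survival_heterogeneous(t)


def survival_homogeneous(t, steps=20000):
    """Population B: exp(-cumulative hazard), integrated by the trapezoid rule."""
    if t == 0:
        return 1.0
    dt = t / steps
    total = 0.5 * (shared_hazard(0.0) + shared_hazard(t))
    total += sum(shared_hazard(i * dt) for i in range(1, steps))
    return math.exp(-total * dt)


def max_gap(times):
    """Largest absolute difference between the two population survival curves."""
    return max(
        abs(survival_heterogeneous(t) - survival_homogeneous(t)) for t in times
    )


if __name__ == "__main__":
    print(max_gap([0.0, 0.5, 1.0, 2.0, 4.0]))  # numerically negligible
```

In population A no individual's hazard ever changes, while in population B every individual's hazard declines from 2 toward 1. The declining population hazard in A comes entirely from high-risk individuals leaving the risk set earlier.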

Paper Structure

This paper contains 35 sections, 4 theorems, 77 equations.

Key Result

Theorem 2.3

Let $\Theta$ be an individual hazard mechanism with induced hazard trajectory $h_\Theta(t)$ and associated survival curve $S_\Theta(t)$. Let $X$ be an observable covariate interpreted as partial information about $\Theta$. Then the following statements hold.
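The theorem's statements are not reproduced on this page. As a hedged sketch only, the standard mixture identities that a representation of this kind typically rests on can be written as below. They assume that, given $\Theta$, the event time has hazard $h_\Theta$ and is conditionally independent of $X$. The notation $P(d\theta \mid x)$ for the conditional mechanism distribution is introduced here for illustration and may differ from the paper's.

```latex
% Individual survival curve induced by mechanism \theta
S_\theta(t) = \exp\!\left( -\int_0^t h_\theta(u)\, du \right)

% Observable survival: an average of individual curves given X = x
S(t \mid x) = \mathbb{E}\!\left[ S_\Theta(t) \mid X = x \right]
            = \int S_\theta(t)\, P(d\theta \mid x)

% Observable hazard: a survivor-weighted average of individual hazards
h(t \mid x) = -\frac{\partial}{\partial t} \log S(t \mid x)
            = \frac{\mathbb{E}\!\left[ h_\Theta(t)\, S_\Theta(t) \mid X = x \right]}
                   {\mathbb{E}\!\left[ S_\Theta(t) \mid X = x \right]}
```

Under these assumptions the observable hazard weights each individual hazard by that mechanism's probability of surviving to time $t$, so it generally differs from the plain average $\mathbb{E}[h_\Theta(t) \mid X = x]$ once $t > 0$.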

Theorems & Definitions (13)

  • Definition 2.1: Individual hazard mechanism
  • Remark 2.1: Mechanism space versus function space
  • Remark 2.2: Mechanism stability
  • Definition 2.2: Observable information
  • Theorem 2.3: Representation of observable survival and hazard
  • Theorem 3.2: Structural non-identifiability at the individual level
  • proof
  • proof
  • Proposition C.1: Mechanism-level implications of proportional hazards
  • proof
  • ...and 3 more