Bayesian Event-Based Model for Disease Subtype and Stage Inference

Hongtao Hao, Joseph L. Austerweil

TL;DR

The paper addresses robustness in modeling heterogeneous disease progression by introducing bebms, a Bayesian event-based model that infers the number of subtypes, their progression orderings, and patient staging from cross-sectional data. It uses MCMC inference to jointly estimate biomarker distributions and subtype/stage priors, and selects the number of subtypes by cross-validation. Across synthetic experiments with varied levels of model misspecification and on real ADNI data, bebms improves ordering, staging, and subtype assignment relative to SuStaIn while reducing computational cost, and yields subtype patterns that align more closely with established Alzheimer's progression. These findings suggest bebms can improve patient stratification and clinical trial design in heterogeneous chronic diseases while offering greater robustness to data imperfections.

Abstract

Chronic diseases often progress differently across patients. Rather than varying randomly, progression typically follows one of a small number of subtypes. To capture this structured heterogeneity, the Subtype and Stage Inference Event-Based Model (SuStaIn) estimates the number of subtypes and the order of disease progression for each subtype, and assigns each patient to a subtype, from primarily cross-sectional data. It has been widely applied to uncover the subtypes of many diseases and inform our understanding of them. But how robust is its performance? In this paper, we develop a principled Bayesian subtype variant of the event-based model (BEBMS) and compare its performance to SuStaIn in a variety of synthetic data experiments with varied levels of model misspecification. BEBMS substantially outperforms SuStaIn across ordering, staging, and subtype assignment tasks. Further, we apply BEBMS and SuStaIn to a real-world Alzheimer's data set. We find that BEBMS yields results that are more consistent with the scientific consensus on Alzheimer's disease progression than those of SuStaIn.

Paper Structure

This paper contains 29 sections, 22 equations, 21 figures, 6 tables.

Figures (21)

  • Figure 1: bebms as a graphical model.
  • Figure 2: Normalized Kendall’s $\tau$ across all nine synthetic experiments. Each panel corresponds to an experiment; within each panel, participant sizes ($J$) are shown across columns, and within each column three healthy ratios ($R=0.25, 0.5, 0.75$) are displayed from left to right. bebms reduced ordering error by $27.3\%$ relative to SuStaIn, with bebms (Blind) performing nearly identically. SuStaIn results were consistently lower, with margins narrowing under model misspecification (Experiments 8–9). Performance was largely insensitive to participant size and healthy ratio.
  • Figure 3: bebms ADNI ordering. All subtypes begin in the entorhinal region, with Subtype 1 showing early cognitive decline, Subtype 2 early CSF changes, and Subtype 3 early neurodegeneration.
  • Figure 4: bebms Metropolis–Hastings sampler.
  • Figure 5: (1) Theoretical normal distributions; (2) Theoretical non-normal distributions; (3) Empirical distributions in one synthetic dataset of Exp. 9; (4) Empirical distributions in one synthetic dataset of Exp. 1.
  • ...and 16 more figures
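
The Figure 2 caption reports ordering error as a normalized Kendall's $\tau$. The excerpt does not give the paper's exact definition, so the following is only a minimal sketch under one common reading: the fraction of biomarker pairs whose relative order differs between the true and the estimated progression ordering (0 for identical orderings, 1 for fully reversed ones). The function name and the biomarker labels are hypothetical and used for illustration only.

```python
from itertools import combinations


def normalized_kendall_tau_distance(true_order, est_order):
    """Fraction of item pairs placed in opposite relative order.

    Assumption: this is the normalized Kendall tau *distance*; the
    paper's own metric may be defined or scaled differently.
    """
    if sorted(true_order) != sorted(est_order):
        raise ValueError("Both orderings must contain the same items.")
    n = len(true_order)
    if n < 2:
        return 0.0

    # Position of each item in each ordering.
    pos_true = {item: i for i, item in enumerate(true_order)}
    pos_est = {item: i for i, item in enumerate(est_order)}

    # A pair is discordant when the two orderings disagree on which
    # of the two items comes first.
    discordant = sum(
        1
        for a, b in combinations(true_order, 2)
        if (pos_true[a] - pos_true[b]) * (pos_est[a] - pos_est[b]) < 0
    )
    return discordant / (n * (n - 1) / 2)


# Hypothetical biomarker labels, for illustration only.
truth = ["A", "B", "C", "D"]
estimate = ["A", "C", "B", "D"]  # one adjacent swap
error = normalized_kendall_tau_distance(truth, estimate)
```

With four items there are six pairs, so a single adjacent swap gives an error of 1/6. Under this reading, lower values mean the estimated ordering is closer to the true one.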