Disease Progression and Subtype Modeling for Combined Discrete and Continuous Input Data

Sterre de Jonge; Elisabeth J. Vinke; Meike W. Vernooij; Daniel C. Alexander; Alexandra L. Young; Esther E. Bron

TL;DR

This paper proposes the Mixed Events model, a novel disease progression model that handles both discrete and continuous data types. Implemented within the Subtype and Stage Inference (SuStaIn) framework, the resulting Mixed-SuStaIn enables subtype and progression modeling.

Abstract

Disease progression modeling provides a robust framework to identify long-term disease trajectories from short-term biomarker data. It is a valuable tool to gain a deeper understanding of diseases with a long disease trajectory, such as Alzheimer's disease. A key limitation of most disease progression models is that they are specific to a single data type (e.g., continuous data), thereby limiting their applicability to heterogeneous, real-world datasets. To address this limitation, we propose the Mixed Events model, a novel disease progression model that handles both discrete and continuous data types. This model is implemented within the Subtype and Stage Inference (SuStaIn) framework, resulting in Mixed-SuStaIn, enabling subtype and progression modeling. We demonstrate the effectiveness of Mixed-SuStaIn through simulation experiments and real-world data from the Alzheimer's Disease Neuroimaging Initiative, showing that it performs well on mixed datasets. The code is available at: https://github.com/ucl-pond/pySuStaIn.

Paper Structure (11 sections, 5 equations, 4 figures, 2 tables)

Introduction
Methods
  Mathematical Model for Mixed Data
  Subtyping
  Simulation Experiments
  Real-World Data Validation
Results
  Simulation Experiments
  Real-World Data Validation
Conclusion
References

Figures (4)

Figure 1: Example biomarker trajectory. Z-scored biomarkers (red) reach abnormality at z-scores 1–3 and accumulate at z-max = 4. The ordinal biomarker (yellow) reaches abnormality at scores 1–3, while the binary biomarker (green) transitions once, resulting in ten events.
Figure 2: Accuracy of Mixed-SuStaIn in recovering the ground truth subtype patterns on synthetic data. Error bars indicate standard deviation. Experiment settings: number of subjects ($J$, green), subtypes ($C$, purple), biomarkers ($I$, yellow) and values ($V$, red). $^\star$ indicates default values.
Figure 3: Disease progression patterns of subtype 1 (n=458) and subtype 2 (n=183) identified by Mixed-SuStaIn. The top rows show cortical regions becoming increasingly abnormal (higher z-scores) across disease stages. The “total brain” biomarker reflects global brain change, but is visualized only on cortical areas for clarity. Bottom rows depict binary progression of cerebrospinal fluid biomarkers.
Figure 4: The probability that subjects from each diagnostic category, including the proportion of converters, belong to each Mixed-SuStaIn stage for subtype 1 (a) and subtype 2 (b). Converters in the CN bars represent CN-to-MCI conversion; converters in the MCI bars represent MCI-to-AD conversion. AD=Alzheimer's Disease, CN=cognitively normal, MCI=mild cognitive impairment.
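
The event count in the Figure 1 caption can be checked with a small sketch. It assumes that each z-scored or ordinal biomarker contributes one event per abnormality level and that a binary biomarker contributes a single event. The caption does not state how many z-scored biomarkers are shown; two is an assumption chosen because it yields the ten events mentioned. The `Biomarker` class, the `events_for` helper, and the biomarker names are hypothetical and are not part of pySuStaIn.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Biomarker:
    """Hypothetical container for illustration; not a pySuStaIn type."""
    name: str
    kind: str           # "zscore", "ordinal", or "binary"
    levels: tuple = ()  # abnormality levels; unused for binary biomarkers


def events_for(biomarker):
    """List the events one biomarker contributes to the progression sequence."""
    if biomarker.kind == "binary":
        # A binary biomarker transitions once: normal -> abnormal.
        return [(biomarker.name, "abnormal")]
    if biomarker.kind in ("zscore", "ordinal"):
        # One event per abnormality level that the biomarker can reach.
        return [(biomarker.name, level) for level in biomarker.levels]
    raise ValueError(f"unknown biomarker kind: {biomarker.kind}")


# Assumed configuration: two z-scored biomarkers (the count is an assumption),
# one ordinal biomarker, and one binary biomarker.
biomarkers = [
    Biomarker("zscore_biomarker_1", "zscore", (1, 2, 3)),
    Biomarker("zscore_biomarker_2", "zscore", (1, 2, 3)),
    Biomarker("ordinal_biomarker", "ordinal", (1, 2, 3)),
    Biomarker("binary_biomarker", "binary"),
]

events = [event for b in biomarkers for event in events_for(b)]
print(len(events))  # 2*3 + 3 + 1 = 10
```

A progression pattern in this setting is an ordering of these events; the sketch only enumerates them and does not fit any model.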
