Longitudinal Progression Prediction of Alzheimer's Disease with Tabular Foundation Model

Yilang Ding, Jiawen Ren, Jiaying Lu, Gloria Hyunjung Kwak, Armin Iraji, Shengpu Tang, Alex Fedorov

TL;DR

This work addresses forecasting of Alzheimer's disease progression from multimodal longitudinal data. It introduces L2C-TabPFN, which first converts irregular longitudinal records into fixed-length cross-sectional vectors via a longitudinal-to-cross-sectional (L2C) transformation and then applies TabPFN, a tabular foundation model pretrained on synthetic tabular data that makes predictions through in-context learning. The approach is evaluated on the TADPOLE dataset for three outcomes: diagnostic status, cognitive scores, and ventricular volume. Compared with the Frog baseline (an XGBoost-based method), L2C-TabPFN is competitive on diagnosis and cognition and delivers state-of-the-art accuracy for ventricular volume forecasting, the outcome with the largest predictive gain. This result highlights the promise of transformer-based tabular models for imaging biomarker prediction. SHAP-based interpretability analyses show clinically relevant feature contributions and task-dependent attribution patterns, supporting the interpretability of tabular foundation models for longitudinal Alzheimer's disease prediction.

Abstract

Alzheimer's disease is a progressive neurodegenerative disorder that remains challenging to predict due to its multifactorial etiology and the complexity of multimodal clinical data. Accurate forecasting of clinically relevant biomarkers, including diagnostic and quantitative measures, is essential for effective monitoring of disease progression. This work introduces L2C-TabPFN, a method that integrates a longitudinal-to-cross-sectional (L2C) transformation with a pre-trained Tabular Foundation Model (TabPFN) to predict Alzheimer's disease outcomes using the TADPOLE dataset. L2C-TabPFN converts sequential patient records into fixed-length feature vectors, enabling prediction of diagnosis, cognitive scores, and ventricular volume. Experimental results demonstrate that L2C-TabPFN achieves competitive performance on diagnostic and cognitive outcomes and state-of-the-art results in ventricular volume prediction. Ventricular volume is a key imaging biomarker that reflects neurodegeneration and progression in Alzheimer's disease. These findings highlight the potential of tabular foundation models for advancing longitudinal prediction of clinically relevant imaging markers in Alzheimer's disease.
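To make the L2C step concrete, the following is a minimal sketch of how one patient's irregular visit history for a single measurement could be collapsed into a fixed-length feature vector. The function name `l2c_features`, the choice of summary statistics (last value, time since last value, minimum, maximum, mean, and linear slope), and the toy numbers are assumptions for illustration only; the feature set actually used by L2C-TabPFN may differ.

```python
from statistics import mean

# Hypothetical feature names; the paper's exact L2C features are not listed here.
FEATURE_KEYS = ("last", "months_since_last", "min", "max", "mean", "slope")


def l2c_features(visits, forecast_month):
    """Summarize irregular (month, value) visits into a fixed-length dict.

    visits: list of (month, value) pairs; value may be None when missing.
    forecast_month: month for which a prediction is wanted; only earlier
    visits are used, so no future information leaks into the features.
    """
    observed = sorted(
        ((m, v) for m, v in visits if v is not None and m < forecast_month),
        key=lambda pair: pair[0],
    )
    features = dict.fromkeys(FEATURE_KEYS)  # all None when nothing is observed
    if not observed:
        return features

    months = [m for m, _ in observed]
    values = [v for _, v in observed]
    features["last"] = values[-1]
    features["months_since_last"] = forecast_month - months[-1]
    features["min"] = min(values)
    features["max"] = max(values)
    features["mean"] = mean(values)

    # Least-squares slope of value against month; needs two distinct months.
    m_bar, v_bar = mean(months), mean(values)
    denom = sum((m - m_bar) ** 2 for m in months)
    if denom > 0:
        features["slope"] = (
            sum((m - m_bar) * (v - v_bar) for m, v in observed) / denom
        )
    return features


# Toy example (made-up values): three usable visits and one missing measurement.
example = l2c_features([(0, 20.0), (6, None), (12, 22.0), (30, 25.0)], 36)
```

Whatever the number or spacing of visits, the output has the same keys, which is what allows a tabular model such as TabPFN or XGBoost to consume it as an ordinary row.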

Paper Structure

This paper contains 14 sections, 2 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Illustration of the longitudinal-to-cross-sectional (L2C) feature transformation and forecasting process. Longitudinal multimodal features such as diagnosis (DX), ADAS scores, ventricular volumes, intracranial volume (ICV), and CDRSB are organized over multiple timepoints. The L2C transformation converts these into a fixed-length feature vector, which serves as input to the forecasting model. The model then predicts clinical and imaging outcomes for future timepoints.
  • Figure 2: Overview of the experimental pipeline for L2C-TabPFN forecasting. The process starts with the TADPOLE dataset and proceeds through data cleaning, preprocessing, and longitudinal-to-cross-sectional (L2C) feature transformation. The pipeline then creates cross-sectional training and evaluation sets, applies data augmentation to the training set, and trains the TabPFN model. Model performance is evaluated on the holdout set D2.
  • Figure 3: SHAP summary plots for feature importance in the ventricles, diagnosis (DX), and ADAS models for L2C-TabPFN. Each point represents the SHAP value for a specific feature in a single sample. Feature importance is ranked from top to bottom, with color indicating the feature value (blue for low and pink for high). The plots highlight which features have the greatest impact on model predictions across all samples.
  • Figure 4: SHAP summary plots for feature importance in the ventricles, diagnosis (DX), and ADAS models for Frog. Each point represents the SHAP value for a specific feature in a single sample. Feature importance is ranked from top to bottom, with color indicating the feature value (blue for low and pink for high). The plots highlight which features have the greatest impact on model predictions across all samples.
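The top-to-bottom ordering in SHAP summary plots like those in Figures 3 and 4 is conventionally derived from the mean absolute SHAP value of each feature across samples. The sketch below reproduces that ranking from a plain matrix of SHAP values. The function name, the feature names, and the numbers are hypothetical and are not taken from the paper's results.

```python
def rank_features_by_mean_abs_shap(shap_values, feature_names):
    """Return (feature, mean |SHAP|) pairs sorted from most to least important.

    shap_values: list of rows, one per sample, each with one SHAP value
    per feature, in the same order as feature_names.
    """
    n_samples = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Toy SHAP matrix (made-up values): 3 samples x 3 hypothetical features.
toy_names = ["ventricles_last", "ICV", "ADAS13_last"]
toy_shap = [
    [0.5, -0.1, 0.2],
    [-0.7, 0.1, 0.0],
    [0.6, -0.2, -0.1],
]
ranking = rank_features_by_mean_abs_shap(toy_shap, toy_names)
```

Taking the absolute value before averaging matters: a feature that pushes predictions strongly up for some patients and strongly down for others would otherwise average out to near zero and appear unimportant.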