R-GenIMA: Integrating Neuroimaging and Genetics with Interpretable Multimodal AI for Alzheimer's Disease Progression
Kun Zhao, Siyuan Dai, Yingying Zhang, Guodong Liu, Pengfei Gu, Chenghua Lin, Paul M. Thompson, Alex Leow, Heng Huang, Lifang He, Liang Zhan, Haoteng Tang
TL;DR
The paper introduces R-GenIMA, an interpretable multimodal AI framework that uses a large language model to fuse ROI-wise MRI tokens with SNP prompts and model Alzheimer's disease progression across NC, SMC, MCI, and AD. By preserving regional anatomical detail with an ROI-wise vision transformer (RiT) and aligning genetic variation through structured prompts, the approach enables cross-modal reasoning and yields biologically meaningful ROI–gene associations. On the ADNI dataset, R-GenIMA achieves state-of-the-art four-way classification, with the Mixture Data configuration delivering near-perfect accuracy and the model-prioritized genes showing enrichment for established AD risk loci. The results reveal stage-specific neuroanatomical signatures and coherent ROI–gene patterns that reflect a progression from synaptic vulnerability to network disruption. They highlight the potential for early risk stratification and mechanistic insight, while the authors acknowledge generalizability and data-panel limitations.
Abstract
Early detection of Alzheimer's disease (AD) requires models capable of integrating macro-scale neuroanatomical alterations with micro-scale genetic susceptibility, yet existing multimodal approaches struggle to align these heterogeneous signals. We introduce R-GenIMA, an interpretable multimodal large language model that couples a novel ROI-wise vision transformer with genetic prompting to jointly model structural MRI and single nucleotide polymorphism (SNP) variation. By representing each anatomically parcellated brain region as a visual token and encoding SNP profiles as structured text, the framework enables cross-modal attention that links regional atrophy patterns to underlying genetic factors. Applied to the ADNI cohort, R-GenIMA achieves state-of-the-art performance in four-way classification across normal cognition (NC), subjective memory concerns (SMC), mild cognitive impairment (MCI), and AD. Beyond predictive accuracy, the model yields biologically meaningful explanations by identifying stage-specific brain regions and gene signatures, as well as coherent ROI–gene association patterns across the disease continuum. Attention-based attribution highlights genes that are consistently enriched for established GWAS-supported AD risk loci, including APOE, BIN1, CLU, and RBFOX1. Stage-resolved neuroanatomical signatures identify shared vulnerability hubs across disease stages alongside stage-specific patterns: striatal involvement in subjective decline, frontotemporal engagement during prodromal impairment, and consolidated multimodal network disruption in AD. These results demonstrate that interpretable multimodal AI can synthesize imaging and genetics to reveal mechanistic insights, providing a foundation for clinically deployable tools that enable earlier risk stratification and inform precision therapeutic strategies in Alzheimer's disease.
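As a rough illustration of the ROI-as-token and SNP-as-text idea described in the abstract, the following toy sketch may help. It is not the authors' implementation: the per-ROI features (mean intensity, voxel fraction), the prompt template, the SNP identifiers `snp_a` and `snp_b`, the random embeddings, and all function names are assumptions made purely for illustration, and the learned transformer and language model are replaced by a single scaled dot-product attention step.

```python
import numpy as np

def roi_tokens(volume, labels):
    """Build one token per ROI from toy features: [mean intensity, voxel fraction]."""
    roi_ids = [int(r) for r in np.unique(labels) if r != 0]  # label 0 = background
    feats = [[volume[labels == r].mean(), (labels == r).mean()] for r in roi_ids]
    return roi_ids, np.asarray(feats, dtype=float)

def snp_prompt(genotypes):
    """Render {(gene, snp_id): minor-allele count} as structured text (hypothetical template)."""
    lines = [f"gene={g}; snp={s}; minor_alleles={c}"
             for (g, s), c in sorted(genotypes.items())]
    return "\n".join(lines)

def cross_attention(queries, keys):
    """Scaled dot-product attention weights; each row sums to 1."""
    scores = queries @ keys.T / np.sqrt(keys.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Synthetic 4x4x4 "MRI" volume with three labelled regions (not real data).
volume = rng.random((4, 4, 4))
labels = np.zeros((4, 4, 4), dtype=int)
labels[:2] = 1
labels[2:, :2] = 2
labels[2:, 2:] = 3
roi_ids, tokens = roi_tokens(volume, labels)

# Toy genotypes; gene names come from the abstract, SNP ids are placeholders.
genotypes = {("APOE", "snp_a"): 2, ("BIN1", "snp_b"): 1}
prompt = snp_prompt(genotypes)

# Random projections stand in for learned image and text encoders.
d = 8
roi_emb = tokens @ rng.normal(size=(tokens.shape[1], d))
snp_emb = rng.normal(size=(len(genotypes), d))

# Rows: SNP entries (queries); columns: ROIs (keys).
attn = cross_attention(snp_emb, roi_emb)
print(prompt)
print(attn.shape)
```

In this sketch each row of `attn` is a distribution over ROIs for one SNP entry, which is the kind of quantity an attention-based ROI–gene attribution would inspect. With random embeddings the weights carry no biological meaning.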
