GLACIAL: Granger and Learning-based Causality Analysis for Longitudinal Imaging Studies

Minh Nguyen, Gia H. Ngo, Mert R. Sabuncu

TL;DR

GLACIAL extends Granger causality (GC) to sparse, multi-subject longitudinal imaging data by combining GC with a single multi-task neural forecaster and using model interpolation to handle missing data. A causal link is inferred by testing whether including a candidate cause improves predictive accuracy on held-out subjects, quantified by $\Delta\mathsf{MSE}$; post-processing then resolves indirect and bidirectional links. The method outperforms a range of baselines on simulated data with nonlinear dynamics and missingness, and on the real-world ADNI dataset it produces interpretable causal graphs consistent with established Alzheimer's disease progression patterns. The approach offers a scalable, flexible tool for causal discovery in realistic, multimodal longitudinal studies, with potential extensions to more advanced predictors and missingness models.

Abstract

The Granger framework is useful for discovering causal relations in time-varying signals. However, most Granger causality (GC) methods are developed for densely sampled timeseries data. A substantially different setting, particularly common in medical imaging, is the longitudinal study design, where multiple subjects are followed and sparsely observed over time. Longitudinal studies commonly track several biomarkers, which are likely governed by nonlinear dynamics that might have subject-specific idiosyncrasies and exhibit both direct and indirect causes. Furthermore, real-world longitudinal data often suffer from widespread missingness. GC methods are not well-suited to handle these issues. In this paper, we propose an approach named GLACIAL (Granger and LeArning-based CausalIty Analysis for Longitudinal studies) to fill this methodological gap by marrying GC with a multi-task neural forecasting model. GLACIAL treats subjects as independent samples and uses the model's average prediction accuracy on hold-out subjects to probe causal links. Input dropout and model interpolation are used to efficiently learn nonlinear dynamic relationships between a large number of variables and to handle missing values respectively. Extensive simulations and experiments on a real longitudinal medical imaging dataset show GLACIAL beating competitive baselines and confirm its utility. Our code is available at https://github.com/mnhng/GLACIAL.
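
The hold-out test described above can be illustrated with a minimal sketch. It is not the authors' implementation: a linear least-squares forecaster stands in for the multi-task neural model, two separate models are fit instead of one model trained with input dropout, and the simulated data, the function names (`simulate_subjects`, `heldout_mse`, `delta_mse`), and the 150/50 subject split are all assumptions made for illustration.

```python
import numpy as np


def simulate_subjects(n_subjects=200, n_steps=12, seed=0):
    """Toy longitudinal data (hypothetical): x drives y with a one-step lag."""
    rng = np.random.default_rng(seed)
    # x follows a Gaussian random walk for each subject.
    x = np.cumsum(rng.normal(size=(n_subjects, n_steps)), axis=1)
    y = np.zeros_like(x)
    for t in range(1, n_steps):
        noise = 0.1 * rng.normal(size=n_subjects)
        y[:, t] = 0.5 * y[:, t - 1] + 0.8 * x[:, t - 1] + noise
    return x, y


def heldout_mse(features, target, train_idx, test_idx):
    """Fit least squares on training subjects; return MSE on held-out subjects."""
    n_feat = features.shape[-1]
    a_train = features[train_idx].reshape(-1, n_feat)
    b_train = target[train_idx].reshape(-1)
    coef, *_ = np.linalg.lstsq(a_train, b_train, rcond=None)
    pred = features[test_idx].reshape(-1, n_feat) @ coef
    return float(np.mean((target[test_idx].reshape(-1) - pred) ** 2))


def delta_mse(cause, effect, train_idx, test_idx):
    """Held-out MSE without the candidate cause minus MSE with it."""
    past_effect, past_cause = effect[:, :-1], cause[:, :-1]
    future = effect[:, 1:]
    bias = np.ones_like(past_effect)
    restricted = np.stack([past_effect, bias], axis=-1)
    full = np.stack([past_effect, past_cause, bias], axis=-1)
    return (heldout_mse(restricted, future, train_idx, test_idx)
            - heldout_mse(full, future, train_idx, test_idx))


x, y = simulate_subjects()
# Subjects are treated as independent samples: split by subject, not by time.
train_idx, test_idx = np.arange(150), np.arange(150, 200)
d_xy = delta_mse(x, y, train_idx, test_idx)  # does past x help forecast y?
d_yx = delta_mse(y, x, train_idx, test_idx)  # does past y help forecast x?
print(f"dMSE x->y: {d_xy:.3f}, dMSE y->x: {d_yx:.3f}")
```

In this toy setup, a clearly positive difference in one direction and a near-zero difference in the other would point to x driving y. The thresholding, missing-data handling, and post-processing used by GLACIAL are not reproduced here.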

Paper Structure (33 sections, 25 equations, 17 figures, 2 tables, 2 algorithms)


Figures (17)

  • Figure 1: GLACIAL. Overview of the proposed approach for longitudinal studies.
  • Figure 2: Simulation. (A) 7-node graph having all basic structures (chain, fork, collider). (B) Subject with random-walk trajectories and linear SCM (data before standardizing to zero mean and unit variance). Only timepoints under vertical lines are observed. (C) Subject with sigmoid trajectories and linear SCM. (D) More realistic 39-node graph resembling the RTK/RAS signaling pathway. Nodes in the same cluster have the same causal relations.
  • Figure 3: Average F1-scores at different settings of sample path, lag-time and measurement noise (7-node graph). GLACIAL outperforms baselines in most settings (see Appendix \ref{app:more_comparisons} for more comparisons).
  • Figure 4: Average F1-scores at different settings of lag-time and measurement noise (39-node graph, Gaussian random-walk). GLACIAL outperforms baselines in most settings (see Appendix \ref{app:more_comparisons} for more comparisons).
  • Figure 5: Average F1-scores at various levels of missing at random. Lag-time = 5. Noise level = 0.1. GLACIAL usually outperforms baselines. Running GLACIAL for more repetitions (i.e. 30 instead of 4, denoted as GLACIAL 30; see Section \ref{ssec:imp_details}) can improve performance when dealing with missing data.
  • ...and 12 more figures