Generative Autoregressive Transformers for Model-Agnostic Federated MRI Reconstruction

Valiyeh A. Nezhad, Gokberk Elmas, Bilal Kabas, Fuat Arslan, Emine U. Saritas, Tolga Çukur

TL;DR

This work addresses the challenge of cross-site generalization in MRI reconstruction under privacy constraints by proposing FedGAT, a model-agnostic federated learning framework. It decouples collaborative learning into two tiers: a first tier that federates the training of a generative autoregressive transformer (GAT) prior, built from a frozen VAE and a site-conditioned autoregressive transformer, and a second tier where each site trains its own reconstruction model on a mix of local data and synthetic data generated by the prior. Site prompts and multi-scale autoregression enable controlled, high-fidelity synthesis across sites, while augmentation with synthetic data improves generalization without sharing raw data. Experiments on multi-institutional datasets demonstrate that FedGAT surpasses state-of-the-art FL baselines in both within-site and cross-site reconstruction, validating its ability to support heterogeneous architectures and scalable collaboration in privacy-preserving MRI studies.

Abstract

While learning-based models hold great promise for MRI reconstruction, single-site models trained on limited local datasets often show poor generalization. This has motivated collaborative training across institutions via federated learning (FL), a privacy-preserving framework that aggregates model updates instead of sharing raw data. Conventional FL requires architectural homogeneity, restricting sites from using models tailored to their resources or needs. To address this limitation, we propose FedGAT, a model-agnostic FL technique that first collaboratively trains a global generative prior for MR images, adapted from a natural image foundation model composed of a variational autoencoder (VAE) and a transformer that generates images via spatial-scale autoregression. We fine-tune the transformer module after injecting it with a lightweight site-specific prompting mechanism, keeping the VAE frozen, to efficiently adapt the model to multi-site MRI data. In a second tier, each site independently trains its preferred reconstruction model by augmenting local data with synthetic MRI data from other sites, generated by site-prompting the tuned prior. This decentralized augmentation improves generalization while preserving privacy. Experiments on multi-institutional datasets show that FedGAT outperforms state-of-the-art FL baselines in both within- and cross-site reconstruction performance under model-heterogeneous settings.
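To make "aggregates model updates instead of sharing raw data" concrete, the sketch below shows a size-weighted parameter average in the style of FedAvg. The abstract does not state which aggregation rule FedGAT uses, so the rule itself, the function name `federated_average`, the parameter key `attn.w`, and the sample counts are all illustrative assumptions. The only part taken from the abstract is the premise that sites exchange parameters of the fine-tuned transformer rather than images.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def federated_average(site_weights: List[Weights], site_sizes: List[int]) -> Weights:
    """Size-weighted average of per-site parameters (FedAvg-style; assumed rule)."""
    if not site_weights or len(site_weights) != len(site_sizes):
        raise ValueError("need one sample count per site")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in site_weights[0].items():
        # Average each scalar entry across sites, weighted by local dataset size.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Hypothetical transformer parameters uploaded by three sites.
# Raw MR images never appear in this exchange.
updates = [
    {"attn.w": [1.0, 2.0]},
    {"attn.w": [3.0, 4.0]},
    {"attn.w": [5.0, 6.0]},
]
global_weights = federated_average(updates, site_sizes=[100, 100, 200])
print(global_weights)
```

Because the third site holds half of the samples in this toy setup, its parameters contribute half of the average.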

Paper Structure

This paper contains 22 sections, 27 equations, 5 figures, 5 tables, 1 algorithm.

Figures (5)

  • Figure 1: FedGAT adopts a two-tier framework to enable collaborative training of heterogeneous MRI reconstruction models. (a) The first tier conducts decentralized training of a global prior $\theta_{\text{GAT}}$, a generative autoregressive transformer that models the distribution of multi-site MR images via autoregressive prediction across increasing spatial scales, guided by a site prompt $\texttt{sp}$ to retain site-specific attributes. (b) The second tier carries out local training of site-specific reconstruction models $H^k_{\phi^k}$ ($k$: site index) on hybrid datasets that combine local MRI data with GAT-generated synthetic images emulating remaining sites, thereby promoting generalization while preserving data privacy.
  • Figure 2: Architecture of the proposed site-prompted GAT prior. (a) The GAT prior includes a variational autoencoder (VAE), whose encoder maps an input MR image onto a set of discrete token maps $\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_S$ across $S$ spatial scales and whose decoder reconstructs the image from these token maps. (b) The transformer module establishes an autoregressive prior over the multi-scale token maps, predicting each higher-scale map $\mathbf{f}_s$ conditioned on preceding maps $\mathbf{f}_{<s} := \{\mathbf{f}_1, \dots, \mathbf{f}_{s-1}\}$. To retain site-specific features in synthetic MR images, a site prompt $\mathbf{sp}(k)$ is derived from a one-hot site index via a gated MLP (gMLP), and used to initialize the site token $\mathbf{st}$ in the transformer.
  • Figure 3: UMAP projection of deep feature embeddings from actual (left) and synthetic (GAT prior; right) MR images across three sites (fastMRI-knee, fastMRI-brain, UMRAM).
  • Figure 4: Representative reconstructions at R=8x from zero-filled Fourier method (Zero-filled), single-site models (Single), FL baselines (FedDF, FedMD, FedGIMP, FedDDA), and FedGAT, along with reference images. (a) Site-specific fastMRI-knee model tested on UMRAM, (b) Site-specific fastMRI-brain model tested on fastMRI-knee, (c) Site-specific UMRAM model tested on fastMRI-brain. Zoom-in windows of error maps and images are included to emphasize method differences.
  • Figure 5: Performance of site-specific models trained at a held-out site (Calgary) on hybrid datasets combining local and GAT-generated synthetic data from a three-site FL setup (fastMRI-knee, fastMRI-brain, UMRAM). With the proportion of local data systematically varied, results reflect within-site (Calgary) and across-site reconstruction performance at the FL sites (see legend).
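
The hybrid datasets in the Figure 1(b) and Figure 5 captions can be illustrated with a small sketch of how a site might mix local images with synthetic images prompted for other sites at a chosen local proportion. Everything here is an assumption made for illustration: the function `build_hybrid_dataset`, the `synthesize` callback standing in for sampling from the site-prompted GAT prior, the string placeholders used as images, and the even split of synthetic samples across sites are not taken from the paper.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[str, str]  # (image placeholder, source site)


def build_hybrid_dataset(
    local_images: Sequence[str],
    synthesize: Callable[[str, random.Random], str],
    other_sites: Sequence[str],
    local_fraction: float,
    seed: int = 0,
) -> List[Sample]:
    """Mix local data with synthetic data so that local samples make up
    roughly `local_fraction` of the result (assumed mixing scheme)."""
    if not 0.0 < local_fraction <= 1.0:
        raise ValueError("local_fraction must be in (0, 1]")
    if local_fraction < 1.0 and not other_sites:
        raise ValueError("synthetic data requires at least one other site")

    rng = random.Random(seed)
    n_local = len(local_images)
    # Number of synthetic samples needed to reach the requested proportion.
    n_synth = round(n_local * (1.0 - local_fraction) / local_fraction)

    hybrid: List[Sample] = [(img, "local") for img in local_images]
    for i in range(n_synth):
        site = other_sites[i % len(other_sites)]  # cycle through site prompts
        hybrid.append((synthesize(site, rng), site))

    rng.shuffle(hybrid)
    return hybrid


def fake_prior(site: str, rng: random.Random) -> str:
    """Hypothetical stand-in for sampling the GAT prior with a site prompt."""
    return f"synthetic-{site}-{rng.random():.3f}"


local = [f"calgary-{i}" for i in range(6)]
fl_sites = ["fastMRI-knee", "fastMRI-brain", "UMRAM"]
hybrid = build_hybrid_dataset(local, fake_prior, fl_sites, local_fraction=0.5)
print(len(hybrid), sum(1 for _, src in hybrid if src == "local"))
```

Lowering `local_fraction` in this sketch adds more synthetic samples per local image, which mirrors the axis that Figure 5 varies. Whether the paper holds the total dataset size fixed while doing so is not stated in the caption.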