
Few-Shot Continual Learning for 3D Brain MRI with Frozen Foundation Models

Chi-Sheng Chen, Xinyu Zhang, Guan-Ying Chen, Qiuzhe Xie, Fan Zhang, En-Jui Kuo

TL;DR

Frozen foundation models with task-specific LoRA adapters offer a practical solution when both tasks must be maintained under few-shot continual learning.

Abstract

Foundation models pretrained on large-scale 3D medical imaging data face challenges when adapted to multiple downstream tasks under continual learning with limited labeled data. We address few-shot continual learning for 3D brain MRI by combining a frozen pretrained backbone with task-specific Low-Rank Adaptation (LoRA) modules. Tasks arrive sequentially -- tumor segmentation (BraTS, T1) followed by brain age estimation (IXI, T2) -- with no replay of previous task data. Each task receives a dedicated LoRA adapter; only the adapter and task-specific head are trained while the backbone remains frozen, which eliminates catastrophic forgetting by design (backward transfer, BWT=0). Among the sequential baselines, full fine-tuning suffers severe forgetting (T1 Dice drops from 0.80 to 0.16 after training on T2), while linear probing achieves strong T1 performance (Dice 0.79) but fails on T2 (MAE 1.45). Our LoRA approach achieves the best balanced performance across both tasks: T1 Dice 0.62$\pm$0.07 and T2 MAE 0.16$\pm$0.05, with zero forgetting and $<0.1\%$ trainable parameters per task, although T2 predictions show systematic age underestimation (Wilcoxon $p<0.001$). Frozen foundation models with task-specific LoRA adapters thus offer a practical solution when both tasks must be maintained under few-shot continual learning.
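
The "zero forgetting by design" claim can be illustrated with a toy sketch. It is not the authors' implementation. It assumes the usual LoRA form, a frozen weight $W$ plus a per-task low-rank update $(\alpha/r)BA$, and the common definition of backward transfer as the mean change in score on earlier tasks after the last task is learned. The class and function names (`LoRAAdapter`, `FrozenLayer`, `backward_transfer`), the layer sizes, and the hand-set `B` matrices are hypothetical; only the Dice values 0.80, 0.16, and 0.62 come from the abstract.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRAAdapter:
    """Per-task low-rank update (alpha / r) * B @ A. Hypothetical helper."""

    def __init__(self, d_in, d_out, r=2, alpha=2.0, seed=0):
        rng = random.Random(seed)
        # A is small random and B is zero, so a fresh adapter changes nothing.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def delta(self, x):
        return [self.scale * v for v in matvec(self.B, matvec(self.A, x))]


class FrozenLayer:
    """One frozen weight matrix shared by all tasks, plus one adapter per task."""

    def __init__(self, W):
        self.W = W          # never updated after pretraining
        self.adapters = {}  # task id -> LoRAAdapter

    def forward(self, x, task):
        base = matvec(self.W, x)
        return [b + d for b, d in zip(base, self.adapters[task].delta(x))]


def backward_transfer(after_last, just_trained):
    """Mean change in score on earlier tasks once the final task is learned.

    Assumed definition; for a higher-is-better metric such as Dice,
    a negative value means forgetting.
    """
    diffs = [a - j for a, j in zip(after_last, just_trained)]
    return sum(diffs) / len(diffs)


rng = random.Random(42)
layer = FrozenLayer([[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)])
x = [1.0, -0.5, 0.25, 2.0]

# Task 1: attach an adapter; setting B by hand stands in for training.
layer.adapters["T1"] = LoRAAdapter(4, 3, seed=1)
layer.adapters["T1"].B = [[0.3, -0.1], [0.0, 0.2], [0.5, 0.4]]
t1_before = layer.forward(x, "T1")

# Task 2: a new adapter is added; W and the T1 adapter are untouched.
layer.adapters["T2"] = LoRAAdapter(4, 3, seed=2)
layer.adapters["T2"].B = [[-0.7, 0.9], [0.1, 0.1], [0.6, -0.2]]
t1_after = layer.forward(x, "T1")

print("T1 output unchanged after T2:", t1_before == t1_after)

# Backward transfer on T1 Dice, using the values quoted in the abstract.
bwt_full_ft = backward_transfer(after_last=[0.16], just_trained=[0.80])
bwt_lora = backward_transfer(after_last=[0.62], just_trained=[0.62])
print("BWT, sequential full fine-tuning:", round(bwt_full_ft, 2))
print("BWT, task-specific LoRA:", bwt_lora)
```

Because the T1 path reads only the frozen `W` and the T1 adapter, training a second adapter cannot alter T1 predictions, so the score difference in the backward-transfer sum is exactly zero. Under the same assumed definition, the full fine-tuning baseline comes out at roughly -0.64 on T1 Dice.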

Paper Structure (28 sections, 1 equation, 7 figures, 8 tables)


Figures (7)

  • Figure 1: Framework: frozen pretrained backbone $f_\theta$ with task-specific LoRA adapters $\phi_k$ and heads $h_k$. Gray = frozen; blue = trainable per task. At inference for T1 (resp. T2): backbone + $\phi_1$ + $h_1$ (resp. $\phi_2$ + $h_2$).
  • Figure 2: Task 1 tumor segmentation (BraTS): representative sample. Input (T2-FLAIR), Ground Truth, LoRA prediction, overlay (red=GT, green=pred). Full six-sample figure in Appendix (Fig. 4).
  • Figure 3: Task 2 brain age regression (IXI): representative sample. Orthogonal MRI views (Axial, Coronal, Sagittal) and predicted vs. actual age. Full six-sample figure in Appendix (Fig. 5).
  • Figure 4: Task 1 tumor segmentation (BraTS): full six-sample stack. Input (T2-FLAIR), Ground Truth, LoRA prediction, overlay (red=GT, green=pred). Samples selected by highest slice-level Dice.
  • Figure 5: Task 2 brain age regression (IXI): full six-sample stack. Orthogonal MRI views and regression results. Samples selected by lowest prediction error.
  • ...and 2 more figures