Knowledge Distillation for Continual Learning of Biomedical Neural Fields

Wouter Visser, Jelmer M. Wolterink

TL;DR

The paper tackles catastrophic forgetting in neural fields when biomedical imaging data arrive incrementally. It analyzes how different neural-field architectures and spectral-bias strategies influence forgetting, and introduces a memory-free knowledge distillation method that preserves the outputs of the previously fitted model without re-accessing old data. The training objective is $\mathcal{L}_{total}=\mathcal{L}_{fit}+\lambda\mathcal{L}_{distil}$, where $\lambda$ weights the distillation term against the fit to the new data. In cardiac cine MRI experiments, distillation improves both image reconstruction and segmentation stability across PE, SIREN, FINER, and DINER, with DINER often delivering the strongest reconstruction performance. The results support distillation as a practical route to continual learning in biomedical neural fields, while showing that the dynamics are sensitive to the model and its hyperparameters. The paper points to future theoretical work on NTK interactions and to alternative continual-learning strategies.
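To make the objective above concrete, here is a minimal NumPy sketch of how $\mathcal{L}_{total}$ could be evaluated. It is not the authors' implementation. The sine-activated MLP, the mean-squared-error form of both terms, and the idea of querying the frozen previous model at coordinates sampled from the earlier domain are all assumptions made for illustration. The names `field`, `init_params`, and `total_loss` are hypothetical.

```python
import numpy as np

def init_params(rng, sizes):
    """Random weights for a small MLP; layer sizes are illustrative only."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def field(params, coords):
    """Hypothetical neural field: sine-activated MLP mapping coordinates to signal values."""
    h = coords
    for W, b in params[:-1]:
        h = np.sin(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def total_loss(params, prev_params, new_coords, new_targets, old_coords, lam):
    """L_total = L_fit + lam * L_distil (both terms assumed to be MSE).

    L_fit compares the current model with the newly available data.
    L_distil compares the current model with the frozen previous model at
    coordinates from the earlier domain, so no old data samples are stored.
    """
    fit = np.mean((field(params, new_coords) - new_targets) ** 2)
    distil = np.mean((field(params, old_coords) - field(prev_params, old_coords)) ** 2)
    return fit + lam * distil, fit, distil

# Usage: the current model starts as a copy of the previous one.
rng = np.random.default_rng(0)
prev_params = init_params(rng, [2, 16, 1])
params = [(W.copy(), b.copy()) for W, b in prev_params]
old_coords = rng.uniform(0.0, 0.5, (64, 2))   # earlier part of the domain (assumed)
new_coords = rng.uniform(0.5, 1.0, (64, 2))   # newly available part (assumed)
new_targets = rng.normal(size=(64, 1))        # synthetic stand-in for new data
total, fit, distil = total_loss(params, prev_params, new_coords,
                                new_targets, old_coords, lam=1.0)
```

In this sketch the distillation term is exactly zero at the start of a new training stage, because the current parameters are a copy of the previous ones. It grows only as fitting the new data pulls the model's outputs away from the previous model's outputs on the earlier domain.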

Abstract

Neural fields are increasingly used as a light-weight, continuous, and differentiable signal representation in (bio)medical imaging. However, unlike discrete signal representations such as voxel grids, neural fields cannot be easily extended. As neural fields are, in essence, neural networks, prior signals represented in a neural field will degrade when the model is presented with new data due to catastrophic forgetting. This work examines the extent to which different neural field approaches suffer from catastrophic forgetting and proposes a strategy to mitigate this issue. We consider the scenario in which data becomes available incrementally, with only the most recent data available for neural field fitting. In a series of experiments on cardiac cine MRI data, we demonstrate how knowledge distillation mitigates catastrophic forgetting when the spatiotemporal domain is enlarged or the dimensionality of the represented signal is increased. We find that the amount of catastrophic forgetting depends, to a large extent, on the neural fields model used, and that distillation could enable continual learning in neural fields.

Paper Structure

This paper contains 13 sections, 2 equations, 6 figures.

Figures (6)

  • Figure 1: Schematic representation of knowledge distillation. $\theta_{1:s}$ denotes the parameters of the current model, while $\theta_{1:s-1}$ denotes the parameters of the previously trained model.
  • Figure 2: We consider domain expansion and signal expansion. Domain expansion affects the input values' range of one or more input coordinates, in this case, time $t$. Signal expansion affects the output values. In this case, we first fit an image, and then its corresponding segmentation mask.
  • Figure 3: Comparison of reconstruction quality between the first and last frame of the 3D+time dataset in the domain expansion setting.
  • Figure 4: Examples of reconstruction of the first frame of the 3D+t dataset for models trained without and with distillation.
  • Figure 5: Comparison of reconstruction and segmentation performance of training types in the signal expansion setting.
  • ...and 1 more figure