Knowledge Distillation for Continual Learning of Biomedical Neural Fields
Wouter Visser, Jelmer M. Wolterink
TL;DR
The paper tackles catastrophic forgetting in neural fields when data arrive incrementally in biomedical imaging. It analyzes how different neural field architectures and spectral-bias strategies influence forgetting and introduces a memory-free knowledge distillation method, with total loss $\mathcal{L}_{\mathrm{total}}=\mathcal{L}_{\mathrm{fit}}+\lambda\mathcal{L}_{\mathrm{distil}}$, that preserves prior outputs without re-accessing old data. In experiments on cardiac cine MRI, distillation improves both image reconstruction and segmentation stability across PE, SIREN, FINER, and DINER, with DINER often delivering the strongest reconstruction performance. The results support distillation as a practical route to continual learning in biomedical neural fields, while showing that forgetting dynamics are sensitive to model choice and hyperparameters and pointing to future theoretical work on NTK interactions and alternative continual-learning strategies.
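As a rough illustration of the combined fitting and distillation loss summarized above, the following minimal sketch computes both terms for toy one-dimensional "fields". It makes several assumptions that the summary does not specify: both terms are taken to be mean squared errors, the distillation term compares the current model with a frozen copy of the previous model at coordinates sampled from the earlier domain, and the names `total_loss`, `student`, `teacher`, and `lam` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of equal shape."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))


def total_loss(student, teacher, new_coords, new_targets, old_coords, lam=1.0):
    """Return (L_total, L_fit, L_distil) with L_total = L_fit + lam * L_distil.

    student, teacher: callables mapping (N, d) coordinates to signal values.
    teacher is assumed to be a frozen copy of the field before new data arrived.
    old_coords: coordinates sampled from the previously fitted domain; only
    coordinates are needed, not the old signal values (hence "memory-free").
    """
    # Fit term: match the newly available data.
    l_fit = mse(student(new_coords), new_targets)
    # Distillation term (assumed MSE): stay close to the old model's outputs.
    l_distil = mse(student(old_coords), teacher(old_coords))
    return l_fit + lam * l_distil, l_fit, l_distil


# Toy example with hypothetical linear "fields" standing in for networks.
rng = np.random.default_rng(0)
teacher = lambda x: 2.0 * x          # frozen previous model
student = lambda x: 2.0 * x + 0.5    # current model, drifted by a constant
old_coords = rng.uniform(0.0, 1.0, size=(64, 1))  # earlier domain
new_coords = rng.uniform(1.0, 2.0, size=(64, 1))  # newly added domain
new_targets = 2.0 * new_coords + 0.5              # student fits these exactly

loss, l_fit, l_distil = total_loss(
    student, teacher, new_coords, new_targets, old_coords, lam=0.1
)
print(loss, l_fit, l_distil)
```

In this toy setting the student fits the new data perfectly, so the fit term vanishes and the total loss comes entirely from the drift on the old domain, which is the quantity the distillation term is meant to penalize. In practice the weight `lam` would trade off fidelity to new data against retention of the earlier signal.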
Abstract
Neural fields are increasingly used as a lightweight, continuous, and differentiable signal representation in (bio)medical imaging. However, unlike discrete signal representations such as voxel grids, neural fields cannot be easily extended. Because neural fields are, in essence, neural networks, signals already represented in a neural field degrade when the model is fitted to new data, a phenomenon known as catastrophic forgetting. This work examines the extent to which different neural field approaches suffer from catastrophic forgetting and proposes a strategy to mitigate it. We consider the scenario in which data become available incrementally, with only the most recent data available for neural field fitting. In a series of experiments on cardiac cine MRI data, we demonstrate how knowledge distillation mitigates catastrophic forgetting when the spatiotemporal domain is enlarged or the dimensionality of the represented signal is increased. We find that the amount of catastrophic forgetting depends, to a large extent, on the neural field model used, and that distillation could enable continual learning in neural fields.
