Erase to Enhance: Data-Efficient Machine Unlearning in MRI Reconstruction

Yuyang Xue, Jingshuai Liu, Steven McDonagh, Sotirios A. Tsaftaris

TL;DR

This work addresses privacy and bias risks in MRI reconstruction when training on mixed-anatomy data by formalizing machine unlearning for image-to-image tasks. It defines a protocol with an oracle model trained on retain data and an original model trained on retain plus forget data to study how to remove the forget influence using data-efficient unlearning methods. The authors adapt reconstruction-focused unlearning strategies (Fine-tuning, Gradient Ascent, and Noisy Labelling) and show that data-efficient variants like NL-FT can approach the oracle's performance on retain data while degrading performance on forgotten data, without full retraining. The study demonstrates that unlearning can mitigate artefacts and hallucinations arising from cross-domain training and suggests practical pathways for bias removal in clinical MRI, with code made publicly available for reproducibility.

Abstract

Machine unlearning is a promising paradigm for removing unwanted data samples from a trained model, towards ensuring compliance with privacy regulations and limiting harmful biases. Although unlearning has been shown in, e.g., classification and recommendation systems, its potential in medical image-to-image translation, specifically in image reconstruction, has not been thoroughly investigated. This paper shows that machine unlearning is possible in MRI tasks and has the potential to benefit bias removal. We set up a protocol to study how much shared knowledge exists between datasets of different organs, allowing us to effectively quantify the effect of unlearning. Our study reveals that combining training data can lead to hallucinations and reduced image quality in the reconstructed data. We use unlearning to remove hallucinations as a proxy exemplar of undesired data removal. Indeed, we show that machine unlearning is possible without full retraining. Furthermore, our observations indicate that maintaining high performance is feasible even when using only a subset of retain data. We have made our code publicly accessible.
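
The protocol and the three strategies named in the TL;DR (Fine-tuning, Gradient Ascent, Noisy Labelling) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear "reconstruction" model, a mean-squared-error loss, synthetic data, and closed-form least-squares training, and every name in it (`fine_tune`, `gradient_ascent`, `noisy_label`, `A_retain`, `A_forget`) is hypothetical. It only shows the shape of each update rule and how the original and oracle models are set up.

```python
import numpy as np

def l2_loss(W, x, y):
    """Mean squared reconstruction error of the linear model x @ W."""
    return float(np.mean((x @ W - y) ** 2))

def l2_grad(W, x, y):
    """Gradient of l2_loss with respect to W."""
    return 2.0 * x.T @ (x @ W - y) / y.size

def fine_tune(W, x_retain, y_retain, lr=0.5, steps=500):
    """FT: gradient descent on retain samples only."""
    W = W.copy()
    for _ in range(steps):
        W -= lr * l2_grad(W, x_retain, y_retain)
    return W

def gradient_ascent(W, x_forget, y_forget, lr=0.5, steps=5):
    """GA: a few steps that increase the loss on forget samples."""
    W = W.copy()
    for _ in range(steps):
        W += lr * l2_grad(W, x_forget, y_forget)
    return W

def noisy_label(W, x_forget, rng, lr=0.5, steps=20):
    """NL: descend towards random-noise targets for the forget inputs."""
    noise = rng.normal(size=(x_forget.shape[0], W.shape[1]))
    W = W.copy()
    for _ in range(steps):
        W -= lr * l2_grad(W, x_forget, noise)
    return W

# Synthetic stand-ins (assumption): two anatomies with different mappings.
rng = np.random.default_rng(0)
dim = 8
A_retain = rng.normal(size=(dim, dim))  # plays the role of the retain anatomy
A_forget = rng.normal(size=(dim, dim))  # plays the role of the forget anatomy
x_r = rng.normal(size=(200, dim))
y_r = x_r @ A_retain
x_f = rng.normal(size=(200, dim))
y_f = x_f @ A_forget

# Original model: fitted on retain + forget. Oracle: fitted on retain only.
W_orig = np.linalg.lstsq(np.vstack([x_r, x_f]), np.vstack([y_r, y_f]), rcond=None)[0]
W_oracle = np.linalg.lstsq(x_r, y_r, rcond=None)[0]

# Data-efficient variants: fine-tune with only a subset of the retain data.
subset = slice(0, 40)
models = {
    "original": W_orig,
    "oracle": W_oracle,
    "FT": fine_tune(W_orig, x_r[subset], y_r[subset]),
    "GA": gradient_ascent(W_orig, x_f, y_f),
    "NL-FT": fine_tune(noisy_label(W_orig, x_f, rng), x_r[subset], y_r[subset]),
}

# (retain loss, forget loss) per model; lower retain loss and higher forget
# loss than the original indicate that the forget influence was reduced.
results = {name: (l2_loss(W, x_r, y_r), l2_loss(W, x_f, y_f)) for name, W in models.items()}
```

In this toy setting, comparing each entry of `results` against the `"original"` and `"oracle"` entries mirrors the comparison the protocol makes: an unlearned model should move towards the oracle on retain data while getting worse on forget data.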

Paper Structure (13 sections, 3 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: The original model $G$ trained with combined brain and knee data shows hallucinations (red circles). The unlearned model $G^U$ can remove artifacts originating from such an anatomy shift, reducing the overall reconstruction error.
  • Figure 2: Machine unlearning in MRI reconstruction overview. The oracle model and the original model are trained on the retain set and the composite set (retain + forget), respectively. Applying an unlearning algorithm to the original model adapts quickly to data removal requests, avoiding full retraining.
  • Figure 3: Unlearning approaches vs. oracle model $\hat{G}$. Unlearning accuracy (UA), brain test accuracy (BTA), and knee test accuracy (KTA) are shown in PSNR. The reciprocal of run-time efficiency (RTE) is normalised to $[0, 1]$ for ease of visualisation. FT and NL-FT achieve the best results, coming closest to the oracle with the highest RTE.
  • Figure 4: Gradient ascent with fine-tuning (GA-$\ell_1$-FT) and noisy labelling with fine-tuning (NL-FT).
  • Figure 5: The Pareto optimum for high unlearning efficiency may be found on the curve fitted to BTA against unlearning time, which is directly related to the number of retain samples used.
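
The quantities in the Figure 3 caption (PSNR-based accuracies and a reciprocal run time normalised to $[0, 1]$) can be computed along the following lines. This is a minimal sketch under stated assumptions: images are taken to be scaled to a known `data_range`, and min-max scaling is assumed for the normalisation, since the caption does not say which scheme is used. The function names `psnr` and `normalised_reciprocal` are hypothetical.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two images."""
    reference = np.asarray(reference, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

def normalised_reciprocal(run_times):
    """Min-max normalise 1 / run_time to [0, 1] (assumed scheme).

    The fastest method maps to 1 and the slowest to 0.
    """
    inverse = 1.0 / np.asarray(run_times, dtype=float)
    span = inverse.max() - inverse.min()
    if span == 0:
        return np.ones_like(inverse)  # all methods equally fast
    return (inverse - inverse.min()) / span
```

Under these assumptions, UA, BTA, and KTA would each be a `psnr` averaged over the forget, brain test, and knee test sets respectively, and `normalised_reciprocal` would be applied once across the run times of all compared methods.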