MemControl: Mitigating Memorization in Diffusion Models via Automated Parameter Selection

Raman Dutt, Ondrej Bohdal, Pedro Sanchez, Sotirios A. Tsaftaris, Timothy Hospedales

TL;DR

This work tackles memorization in diffusion-based medical image generation by treating model capacity as a controllable resource. It introduces MemControl, a bi-level optimization framework that automatically selects a sparse PEFT parameter mask to minimize memorization while preserving generation quality. On the MIMIC chest X-ray dataset, MemControl achieves a superior trade-off, with extremely small fine-tuning footprints (as low as $0.019\%$ of parameters) and strong transferability to non-medical datasets, outperforming state-of-the-art memorization mitigations and standard PEFT methods. The approach is scalable, reward-agnostic, and complementary to existing techniques, offering a practical, universal strategy for privacy-preserving diffusion-based generation in sensitive domains. The public code enhances reproducibility and potential adoption across diverse tasks.

Abstract

Diffusion models excel in generating images that closely resemble their training data but are also susceptible to data memorization, raising privacy, ethical, and legal concerns, particularly in sensitive domains such as medical imaging. We hypothesize that this memorization stems from the overparameterization of deep models and propose that regularizing model capacity during fine-tuning can mitigate this issue. First, we empirically show that regulating model capacity via parameter-efficient fine-tuning (PEFT) mitigates memorization to some extent; however, it also requires identifying the exact parameter subsets to fine-tune for high-quality generation. To identify these subsets, we introduce a bi-level optimization framework, MemControl, that automates parameter selection using memorization and generation quality metrics as rewards during fine-tuning. The parameter subsets discovered through MemControl achieve a superior tradeoff between generation quality and memorization. For the task of medical image generation, our approach outperforms existing state-of-the-art memorization mitigation strategies while fine-tuning as few as 0.019% of model parameters. Moreover, we demonstrate that the discovered parameter subsets are transferable to non-medical domains. Our framework is scalable to large datasets, agnostic to reward functions, and can be integrated with existing approaches for further memorization mitigation. To the best of our knowledge, this is the first study to empirically evaluate memorization in medical images and propose a targeted yet universal mitigation strategy. The code is available at https://github.com/Raman1121/Diffusion_Memorization_HPO.
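
The bi-level structure described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the component names, the `inner_fine_tune` stand-in, the weighted-sum score, and the use of random search are all assumptions made for illustration. The outer level proposes a binary mask over PEFT components; the inner level (here a toy function instead of real fine-tuning) returns a quality metric and a memorization metric, both lower-is-better.

```python
import random

# Hypothetical PEFT component groups (not the paper's actual search space).
COMPONENTS = ["attn_q", "attn_k", "attn_v", "attn_out", "norm_scale", "bias"]


def inner_fine_tune(mask):
    """Toy stand-in for the inner level: "fine-tune" only the masked-in
    components and return (d_fid, d_mem), both lower-is-better.

    Assumed toy behaviour: more trainable capacity improves quality
    (lower d_fid) but increases memorization (higher d_mem).
    """
    capacity = sum(mask) / len(mask)
    d_fid = 1.0 - 0.8 * capacity
    d_mem = capacity ** 2
    return d_fid, d_mem


def outer_mask_search(n_trials=50, weight=0.5, seed=0):
    """Outer level: random search over binary masks, scored by a weighted
    sum of the two metrics (an assumed simplification of an HPO search)."""
    rng = random.Random(seed)
    best_mask, best_score = None, float("inf")
    for _ in range(n_trials):
        mask = tuple(rng.randint(0, 1) for _ in COMPONENTS)
        d_fid, d_mem = inner_fine_tune(mask)
        score = weight * d_fid + (1.0 - weight) * d_mem
        if score < best_score:
            best_mask, best_score = mask, score
    return best_mask, best_score


best_mask, best_score = outer_mask_search()
selected = [name for name, bit in zip(COMPONENTS, best_mask) if bit]
print("selected components:", selected, "score:", round(best_score, 3))
```

In a real setting, the inner function would fine-tune the diffusion model with only the selected components unfrozen and then measure generation quality and memorization on held-out data, which is far more expensive than this toy evaluation.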

Paper Structure

This paper contains 17 sections, 8 equations, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: Conventional full fine-tuning results in the generation of nearly identical images across different seeds for the same text prompt. Memorization is evident through the replication of artefacts (red squares), which notably occurs with high precision and is consistently observed in the generated images. This replication can lead to patient information leakage and potential re-identification. Column 1 displays the original training images, while columns 2-5 show the closest generated samples across different seeds.
  • Figure 2: Overall schematic of our framework. Stage 1: PEFT Mask Search (top): We use a subset of the training set to search for the mask that decides which PEFT components of a pre-trained model $\theta$ should be fine-tuned to optimise both generation quality ($d^{fid}$) and memorization ($d^{mem}$). Stage 2: Fine-tune with mask (bottom): The optimal mask from the HPO search is used for fine-tuning on the full dataset, and final results are reported on the test set ($\mathcal{D}^{test}$).
  • Figure 3: Plot illustrating how model capacity affects the memorization vs. generation quality tradeoff. The HPO search explored various combinations of parameter subsets during fine-tuning (blue markers). Each combination results in a different model capacity, generation quality ($d^{fid} \downarrow$) and memorization ($d^{mem} \downarrow$). The performance of these subsets is compared to two default PEFT configurations (orange squares) and full fine-tuning (green square). The Pareto front of optimal parameter subsets, i.e. those with the lowest $d^{fid}$ and $d^{mem}$, is marked in red, while the final optimal solution is marked in cyan.
  • Figure 4: Plot illustrating a qualitative comparison between full fine-tuning (Full FT), Full FT + RWA, and MemControl. Full FT (col. 1 & 2) generates images that are near-replicas of the original training image. Combining Full FT with the RWA mitigation strategy (col. 3 & 4) diversifies the generated images but deteriorates their quality. MemControl (col. 5 & 6) preserves image quality and avoids generating identical images.
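
The Pareto front mentioned in the Figure 3 caption can be computed from a list of $(d^{fid}, d^{mem})$ pairs, as in the minimal sketch below. The trial values are made up for illustration and are not results from the paper. The rule used to pick a single final point (smallest Euclidean distance to the origin) is an assumption; the caption does not state how the cyan solution was chosen.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    Each point is (d_fid, d_mem); both are lower-is-better. A point q
    dominates p if q is no worse on both metrics and strictly better on one.
    """
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


# Made-up, normalized (d_fid, d_mem) values; NOT results from the paper.
trials = [
    (0.30, 0.80),
    (0.35, 0.40),
    (0.50, 0.20),
    (0.55, 0.45),
    (0.90, 0.10),
    (0.60, 0.25),
]

front = pareto_front(trials)

# Assumed selection rule: the front point closest to the ideal (0, 0).
final = min(front, key=lambda p: p[0] ** 2 + p[1] ** 2)

print("Pareto front:", front)
print("selected point:", final)
```

With two competing objectives there is generally no single best configuration, which is why a front of non-dominated subsets is reported before one solution is singled out.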