
SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning

Cai Selvas-Sala, Lei Kang, Lluis Gomez

Abstract

As multimodal models like CLIP become integral to downstream systems, the need to remove sensitive information is critical. However, machine unlearning for contrastively-trained encoders remains underexplored, and existing evaluations fail to diagnose fine-grained, association-level forgetting. We introduce SALMUBench (Sensitive Association-Level Multimodal Unlearning), a benchmark built upon a synthetic dataset of 60K persona-attribute associations and two foundational models: a Compromised model polluted with this data, and a Clean model without it. To isolate unlearning effects, both are trained from scratch on the same 400M-pair retain base, with the Compromised model additionally trained on the sensitive set. We propose a novel evaluation protocol with structured holdout sets (holdout identity, holdout association) to precisely measure unlearning efficacy and collateral damage. Our benchmark reveals that while utility-efficient deletion is feasible, current methods exhibit distinct failure modes: they either fail to forget effectively or over-generalize by erasing more than intended. SALMUBench sets a new standard for comprehensive unlearning evaluation, and we publicly release our dataset, models, evaluation scripts, and leaderboards to foster future research.

Paper Structure

This paper contains 58 sections, 8 figures, 6 tables.

Figures (8)

  • Figure 1: CLIP models can memorize and leak private information. CLIP-based systems can associate a face with sensitive attributes (e.g., a phone number) seen during training.
  • Figure 2: The construction pipeline of the SALMU dataset follows five sequential stages.
  • Figure 3: Example of a fictitious persona from the SALMU dataset. We show the reference identity anchor image from the SFHQ dataset (top left), her sensitive attributes (bottom left), and a subset of generated images and captions (top right and bottom right respectively) that constitute the image-text pairs' associations for this persona in the SALMU dataset.
  • Figure 4: Splits and subsets of SALMUBench along with the number of image-text pairs that each of them contains.
  • Figure 5: Cosine similarity between image-text pairs in the sensitive set for the Clean and Compromised models.
  • ...and 3 more figures
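The diagnostic in Figure 5 compares cosine similarities between matched image-text pairs under the Clean and Compromised encoders. A minimal sketch of that computation is below; the embeddings are random stand-ins (the real vectors would come from the benchmark's released CLIP encoders), and the array shapes are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for CLIP embeddings of matched image-caption
# pairs from the sensitive set. Row i of image_emb and row i of text_emb
# form one pair; the 512-d width mirrors a typical CLIP embedding size.
image_emb = rng.standard_normal((8, 512))
text_emb = image_emb + 0.5 * rng.standard_normal((8, 512))  # correlated pair

def cosine_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

sims = cosine_matrix(image_emb, text_emb)
matched = np.diag(sims).mean()                     # true image-text pairs
mismatched = sims[~np.eye(8, dtype=bool)].mean()   # off-diagonal baseline
print(f"matched: {matched:.3f}  mismatched: {mismatched:.3f}")
```

Intuitively, a Compromised model should show high matched-pair similarity on the sensitive set (the association was memorized), while a Clean or successfully unlearned model should pull matched-pair similarity down toward the mismatched baseline.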