MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models

Jiahao Huo, Yibo Yan, Xu Zheng, Yuanhuiyi Lyu, Xin Zou, Zhihua Wei, Xuming Hu

TL;DR

This work addresses the challenge of forgetting targeted visual knowledge in Multimodal Large Language Models (MLLMs) without erasing textual knowledge. It reformulates multimodal machine unlearning (MU) for MLLMs and introduces MMUnlearner, a geometry-constrained gradient ascent method that uses a weight-saliency map to update parameters selectively. Experiments on two MLLMs and two benchmarks show that MMUnlearner outperforms LLM-based MU baselines in erasing visual concepts while preserving textual knowledge, highlighting its effectiveness and efficiency. The approach advances safe and privacy-conscious deployment of multimodal AI by enabling targeted forgetting without sacrificing core multimodal reasoning capabilities.

Abstract

Recent progress in Machine Unlearning (MU) has introduced solutions for the selective removal of private or sensitive information encoded within deep neural networks. Nonetheless, MU for Multimodal Large Language Models (MLLMs) remains in its nascent phase. Therefore, we propose to reformulate the task of multimodal MU in the era of MLLMs, which aims to erase only the visual patterns associated with a given entity while preserving the corresponding textual knowledge encoded within the original parameters of the language model backbone. Furthermore, we develop a novel geometry-constrained gradient ascent method, MMUnlearner. It updates the weights of MLLMs with a weight saliency map jointly restricted by the remaining concepts and textual knowledge during unlearning, thereby preserving parameters essential for non-target knowledge. Extensive experiments demonstrate that MMUnlearner surpasses baselines that fine-tune MLLMs with VQA data directly through Gradient Ascent (GA) or Negative Preference Optimization (NPO) across all evaluation dimensions. Our code is available at [this URL](https://github.com/Z1zs/MMUnlearner).
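
The saliency-guided update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). It assumes one simple masking rule: a parameter is updated only when its squared gradient on the forget loss is at least `threshold` times its squared gradient on the losses to be preserved (remaining concepts and textual knowledge). The function names, the ratio rule, the threshold, and the numbers are all hypothetical, chosen only to show the idea of gradient ascent restricted by a weight saliency map.

```python
from typing import List


def saliency_mask(forget_grads: List[float],
                  preserve_grads: List[float],
                  threshold: float = 1.0,
                  eps: float = 1e-12) -> List[float]:
    """Hypothetical masking rule: 1.0 where the forget gradient dominates
    the preservation gradient (squared-magnitude ratio >= threshold), else 0.0."""
    mask = []
    for g_f, g_p in zip(forget_grads, preserve_grads):
        ratio = (g_f * g_f) / (g_p * g_p + eps)
        mask.append(1.0 if ratio >= threshold else 0.0)
    return mask


def masked_gradient_ascent_step(params: List[float],
                                forget_grads: List[float],
                                mask: List[float],
                                lr: float = 0.1) -> List[float]:
    """Ascend the forget loss, but only on parameters selected by the mask."""
    return [p + lr * m * g for p, m, g in zip(params, mask, forget_grads)]


# Toy values (assumed): three parameters with per-parameter gradients.
params = [0.5, -1.0, 2.0]
forget_grads = [0.9, 0.1, -0.4]    # gradient of the forget (visual concept) loss
preserve_grads = [0.1, 0.8, 0.5]   # gradient of the retain + textual-knowledge loss

mask = saliency_mask(forget_grads, preserve_grads)
new_params = masked_gradient_ascent_step(params, forget_grads, mask)
```

In this toy case only the first parameter is selected, so the other two, which matter more for the preserved knowledge, stay unchanged. That is the sense in which a saliency map preserves parameters essential for non-target knowledge.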

Paper Structure

This paper contains 45 sections, 19 equations, 8 figures, and 8 tables.

Figures (8)

  • Figure 1: Comparison between the previous setting (a) and our proposed one (b) for multimodal machine unlearning.
  • Figure 2: The framework of our reformulated Multimodal Machine Unlearning. Unlike the LLM-based unlearning setting, it emphasizes the accurate removal of specific visual patterns of targeted concepts and the preservation of textual knowledge.
  • Figure 3: An illustration of our proposed MMUnlearner. Compared to traditional approaches employed in previous work, which directly apply LLM-based unlearning algorithms to vanilla MLLMs, our method demonstrates superior parameter efficiency, forgetting performance, and textual knowledge preservation. Both the baseline and our approach are trained on VQA-format data, while textual QA-format data is used to assess the preservation of textual knowledge during evaluation.
  • Figure 4: The overall trade-off between unlearning effectiveness and model utility across five dimensions under varying forget ratios, using LLaVA as the base model. The $x$-axis represents the change in forget classification accuracy relative to the vanilla model, while the $y$-axis captures model utility from multiple perspectives. From left to right, these perspectives encompass Retain VQA, Real-world VQA, Forget QA, Retain QA, and Real-world QA performance.
  • Figure 5: The distribution of the top-$n$ deviated parameters across different MU algorithms for LLaVA, where $n$ corresponds to the number of unmasked parameters in Eq. \ref{eq:mask}. The $x$-axis represents different model layers while the $y$-axis denotes the layer index. Color reflects the density of updated parameters, with darker colors indicating a higher percentage of updates.
  • ...and 3 more figures