
On the Limitations and Prospects of Machine Unlearning for Generative AI

Shiji Zhou, Lianzhe Wang, Jiangnan Ye, Yongliang Wu, Heng Chang

TL;DR

This paper addresses the safety and privacy concerns of Generative AI by examining machine unlearning as a potential remedy. It formalizes unlearning, surveys LLM and image diffusion approaches, and systematically outlines limitations in efficacy, side effects, and measurement. The authors propose three forward-looking directions: standardized benchmarks, holistic metrics, and strategies to balance utility with forgetting, with emphasis on scalability, generality, and safety. By framing a roadmap for robust, interpretable, and scalable unlearning, the work aims to guide practical development of GenAI that respects privacy, reduces misuse, and maintains reliable performance.

Abstract

Generative AI (GenAI), which aims to synthesize realistic and diverse data samples from latent variables or other data modalities, has achieved remarkable results in various domains, such as natural language, images, audio, and graphs. However, GenAI models also pose challenges and risks to data privacy, security, and ethics. Machine unlearning is the process of removing or weakening the influence of specific data samples or features from a trained model, without affecting its performance on other data or tasks. While machine unlearning has shown significant efficacy in traditional machine learning tasks, it is still unclear whether it can help GenAI become safer and better aligned with human values. To this end, this position paper provides an in-depth discussion of machine unlearning approaches for GenAI. First, we formulate the problem of machine unlearning on GenAI and introduce the background. Next, we systematically examine the limitations of machine unlearning on GenAI models by focusing on two representative branches: LLMs and image generative (diffusion) models. Finally, we present our prospects from three main aspects (benchmarks, evaluation metrics, and the utility-unlearning trade-off) and conscientiously advocate for the future development of this field.

Paper Structure

This paper contains 53 sections, 2 equations, and 2 figures.

Figures (2)

  • Figure 1: Summary of our position on the limitations and prospects of machine unlearning methods for GenAI.
  • Figure 2: Example comparison of a baseline LLM and an unlearned LLM, illustrating the tension between forgetting and false memory. Results are adopted from eldan2023s.