Machine Unlearning for Traditional Models and Large Language Models: A Short Survey

Yi Xu

TL;DR

The paper addresses the "right to be forgotten" challenge in ML by surveying machine unlearning (MU) for traditional models and large language models (LLMs). It formalizes MU with a taxonomy that separates data-driven and model-based approaches for traditional models and splits LLM methods into parameter-tuning and parameter-agnostic categories, the latter including in-context unlearning. It also outlines evaluation criteria covering time, accuracy, similarity, attacks, and theory. The survey distinguishes exact from approximate unlearning targets and reviews representative techniques such as SISA, ARCANE, data augmentation, gradient-based unlearning, and task-vector merging, emphasizing efficiency-utility trade-offs and the need for standardized benchmarks. It concludes with a roadmap of challenges and future directions, and stresses rigorous, cross-domain evaluation to ensure that private data is removed without compromising model utility, particularly in evolving LLM ecosystems.

Abstract

With the implementation of personal data privacy regulations, the field of machine learning (ML) faces the challenge of the "right to be forgotten". Machine unlearning has emerged to address this issue, aiming to delete data and reduce its impact on models according to user requests. Despite the widespread interest in machine unlearning, comprehensive surveys on its latest advancements, especially in the field of Large Language Models (LLMs), are lacking. This survey aims to fill this gap by providing an in-depth exploration of machine unlearning, including its definition, classification, and evaluation criteria, as well as challenges in different environments and their solutions. Specifically, this paper categorizes and investigates unlearning on both traditional models and LLMs, and proposes methods for evaluating the effectiveness and efficiency of unlearning, along with standards for performance measurement. This paper reveals the limitations of current unlearning techniques and emphasizes the importance of a comprehensive unlearning evaluation to avoid arbitrary forgetting. This survey not only summarizes the key concepts of unlearning technology but also points out its prominent issues and feasible directions for future research, providing valuable guidance for scholars in the field.

Paper Structure (37 sections, 1 figure, 4 tables)