Unified Parameter-Efficient Unlearning for LLMs

Chenlu Ding, Jiancan Wu, Yancheng Yuan, Jinda Lu, Kai Zhang, Alex Su, Xiang Wang, Xiangnan He

TL;DR

LLMEraser targets privacy-driven unlearning in PEFT-enhanced LLMs by using influence functions to compute updates to the adapter parameters directly, avoiding costly retraining. It recasts the parameter-change calculation as a finite-sum convex quadratic problem solved via SGD with Hessian-vector products, which yields scalable updates that closely approximate retraining for instance-wise unlearning tasks. The approach achieves near-retraining performance across instance removal, query modification, and response correction while delivering substantial efficiency gains on both LLMs and multimodal LLMs. This work advances safe, efficient data forgetting in domain-adapted LLMs with broad applicability to privacy-preserving AI systems.
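To make the "finite-sum quadratic solved via SGD with Hessian-vector products" idea concrete, here is a minimal sketch on a toy ridge-regression model rather than an LLM adapter. It is not the authors' implementation: the data, the loss, the step-size schedule, and every function name (`grad_i`, `hvp_i`, `fit`, `solve_delta`) are assumptions chosen for illustration. The sketch uses the standard influence-function estimate for removing one training instance, `delta ≈ H^{-1} b` with `b = grad_k / n`, and obtains `delta` by running SGD on the quadratic whose minimiser satisfies `H delta = b`, touching the Hessian only through per-sample Hessian-vector products.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def grad_i(theta, x, y, lam):
    # Gradient of the assumed per-sample loss
    # l_i(theta) = 0.5*(x.theta - y)^2 + 0.5*lam*||theta||^2
    r = dot(x, theta) - y
    return [r * xj + lam * tj for xj, tj in zip(x, theta)]

def hvp_i(v, x, lam):
    # Per-sample Hessian-vector product (x x^T + lam*I) v,
    # computed without ever forming the Hessian matrix.
    s = dot(x, v)
    return [s * xj + lam * vj for xj, vj in zip(x, v)]

def fit(X, y, lam, steps=500, lr=0.5):
    # Full-batch gradient descent; a stand-in for fine-tuning.
    theta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        g = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            gi = grad_i(theta, xi, yi, lam)
            g = [a + b / n for a, b in zip(g, gi)]
        theta = [t - lr * a for t, a in zip(theta, g)]
    return theta

def solve_delta(X, b, lam, steps=20000, lr0=0.2, tau=50.0, seed=0):
    # SGD on F(d) = (1/n) * sum_i [0.5 * d^T H_i d - b^T d].
    # Its minimiser satisfies H d = b, where H is the mean of the H_i.
    rng = random.Random(seed)
    d = [0.0] * len(b)
    for t in range(steps):
        x = X[rng.randrange(len(X))]      # sample one training point
        lr = lr0 / (1.0 + t / tau)        # decaying step size (assumed schedule)
        hv = hvp_i(d, x, lam)
        d = [dj - lr * (hj - bj) for dj, hj, bj in zip(d, hv, b)]
    return d

# Toy data (hypothetical values).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [0.5, 0.5], [-1.0, 0.5]]
y = [1.0, 2.0, 2.5, -1.5, 1.0, 0.5]
lam = 0.1

theta_old = fit(X, y, lam)                # the "old" parameters
k, n = 0, len(X)                          # forget training instance k
b = [g / n for g in grad_i(theta_old, X[k], y[k], lam)]
delta = solve_delta(X, b, lam)            # estimated parameter change
theta_new = [t + d for t, d in zip(theta_old, delta)]
```

In this toy setting `theta_new` can be compared against `fit(X[1:], y[1:], lam)`, the parameters obtained by actually retraining without the forgotten instance. For a real adapter, the per-sample Hessian-vector product would typically come from automatic differentiation rather than a closed form.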

Abstract

The advent of Large Language Models (LLMs) has revolutionized natural language processing, enabling advanced understanding and reasoning capabilities across a variety of tasks. Fine-tuning these models for specific domains, particularly through Parameter-Efficient Fine-Tuning (PEFT) strategies like LoRA, has become a prevalent practice due to its efficiency. However, this raises significant privacy and security concerns, as models may inadvertently retain and disseminate sensitive or undesirable information. To address these issues, we introduce a novel instance-wise unlearning framework, LLMEraser, which systematically categorizes unlearning tasks and applies precise parameter adjustments using influence functions. Unlike traditional unlearning techniques that are often limited in scope and require extensive retraining, LLMEraser is designed to handle a broad spectrum of unlearning tasks without compromising model performance. Extensive experiments on benchmark datasets demonstrate that LLMEraser excels in efficiently managing various unlearning scenarios while maintaining the overall integrity and efficacy of the models.

Paper Structure

This paper contains 31 sections, 20 equations, 5 figures, 9 tables, 1 algorithm.

Figures (5)

  • Figure 1: (a) A brief description of the different types of LLM unlearning tasks. (b) The frameworks of the exact and approximate LLM unlearning methods.
  • Figure 2: The framework of LLMEraser. The old adapter is obtained through PEFT on domain-specific data. When an unlearning request arrives (e.g., deleting or correcting certain data in the training set), LLMEraser uses influence functions to compute the parameter changes caused by the request. These estimated parameter changes are added to the old adapter's weights, yielding the new adapter parameters, that is, the unlearned model parameters.
  • Figure 3: (a) Experimental results of the instance removal task using TallRec as the LLM4Rec model on the BookCrossing dataset, where 5% and 10% of the training data were randomly deleted. (b) Experimental results of the query modification task using LLaRA as the LLM4Rec model on the MovieLens dataset, where interactions were randomly removed from 5% and 10% of users.
  • Figure 4: Instance Removal Case Study & Query Modification Case Study.
  • Figure 5: Response Correction Case Study.
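
The update described in the Figure 2 caption can be written in the standard influence-function form. The notation below is an assumption for illustration and may differ from the paper's own equations: $\theta_{\text{old}}$ denotes the trained adapter parameters, $\ell$ the per-instance loss, $n$ the training-set size, $\mathcal{D}_{\text{del}}$ the instances to remove, and $z \to z'$ a corrected instance (a modified query or response).

```latex
\begin{align}
  H &= \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2\, \ell(z_i;\theta_{\text{old}}), \\
  \Delta\theta_{\text{remove}} &\approx \frac{1}{n}\, H^{-1} \sum_{z \in \mathcal{D}_{\text{del}}} \nabla_\theta\, \ell(z;\theta_{\text{old}}), \\
  \Delta\theta_{\text{correct}} &\approx \frac{1}{n}\, H^{-1} \bigl( \nabla_\theta\, \ell(z;\theta_{\text{old}}) - \nabla_\theta\, \ell(z';\theta_{\text{old}}) \bigr), \\
  \theta_{\text{new}} &= \theta_{\text{old}} + \Delta\theta .
\end{align}
```

Under these assumptions, removal subtracts the forgotten instances' contribution to the training objective, while correction swaps the gradient of the original instance for that of its corrected version. In both cases the only expensive quantity is the product of $H^{-1}$ with a single vector.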