SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning

Jinghan Jia, Yihua Zhang, Yimeng Zhang, Jiancheng Liu, Bharat Runwal, James Diffenderfer, Bhavya Kailkhura, Sijia Liu

TL;DR

This work identifies optimizer choice, particularly the use of second-order methods, as a pivotal factor in LLM unlearning and introduces SOUL, a second-order, loss-agnostic framework built on the Sophia optimizer to iteratively unlearn targeted data. By linking influence unlearning with Newton-style updates and employing a diagonal Hessian approximation, SOUL improves forgetting efficacy while preserving utility across TOFU, copyright removal, and detoxification tasks. Empirical results show that SOUL outperforms traditional first-order methods in forget quality, robustness to membership inference attacks (MIA), and downstream accuracy, with reasonable time overhead and enhanced resilience to jailbreaking prompts. The approach provides a practical, broadly applicable path to more effective LLM unlearning and motivates further exploration of second-order optimization in data-removal contexts.

Abstract

Large Language Models (LLMs) have highlighted the necessity of effective unlearning mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims to remove undesired data influences and associated model capabilities without compromising utility beyond the scope of unlearning. While interest in studying LLM unlearning is growing, the impact of the optimizer choice on LLM unlearning remains unexplored. In this work, we shed light on the significance of optimizer selection in LLM unlearning for the first time, establishing a clear connection between second-order optimization and influence unlearning (a classical approach that uses influence functions to update the model for data influence removal). This insight propels us to develop a second-order optimization-based LLM unlearning framework, termed Second-Order UnLearning (SOUL), which extends the static, one-shot model update of influence unlearning to a dynamic, iterative unlearning process. Our extensive experiments show that SOUL consistently outperforms conventional first-order methods across various unlearning tasks, models, and metrics, indicating that second-order optimization offers an effective and broadly applicable solution for LLM unlearning. Code is available at https://github.com/OPTML-Group/SOUL.
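
To make the idea of iterative second-order unlearning concrete, the following is a minimal sketch and not the authors' implementation. It applies a Sophia-style update (a moving-average gradient divided by a moving-average diagonal Hessian estimate, then clipped elementwise) to a gradient-difference objective, retain loss minus forget loss. The quadratic losses, all constants, and the names sophia_style_step, unlearn_grad, and unlearn_hess_diag are hypothetical and chosen only for illustration. The exact Hessian diagonal is used here because the toy problem makes it available; a real LLM setup would need a stochastic estimator instead.

```python
import numpy as np

# Toy stand-ins (assumptions, not from the paper): quadratic "forget" and
# "retain" losses over a 2-parameter model.
C_FORGET = np.array([1.0, -1.0])  # parameters that best fit the forget data
C_RETAIN = np.array([0.0, 0.0])   # parameters that best fit the retain data
A_FORGET, A_RETAIN = 1.0, 4.0     # curvature of each loss


def forget_loss(theta):
    return 0.5 * A_FORGET * np.sum((theta - C_FORGET) ** 2)


def retain_loss(theta):
    return 0.5 * A_RETAIN * np.sum((theta - C_RETAIN) ** 2)


def unlearn_grad(theta):
    # Gradient of the gradient-difference objective: retain_loss - forget_loss.
    return A_RETAIN * (theta - C_RETAIN) - A_FORGET * (theta - C_FORGET)


def unlearn_hess_diag(theta):
    # Exact Hessian diagonal of the toy objective (constant for quadratics).
    return np.full_like(theta, A_RETAIN - A_FORGET)


def sophia_style_step(theta, grad, hess_diag, state,
                      lr=0.05, beta1=0.9, beta2=0.99, gamma=1.0, eps=1e-12):
    """One Sophia-style update: precondition the EMA gradient by the EMA
    diagonal Hessian, then clip each coordinate to [-1, 1]."""
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["h"] = beta2 * state["h"] + (1.0 - beta2) * hess_diag
    ratio = state["m"] / np.maximum(gamma * state["h"], eps)
    return theta - lr * np.clip(ratio, -1.0, 1.0)


theta = np.zeros(2)  # hypothetical "original model" parameters
state = {"m": np.zeros(2), "h": np.zeros(2)}
forget_before = forget_loss(theta)

# Iterative unlearning: many small preconditioned steps instead of one
# one-shot influence-function update.
for _ in range(300):
    theta = sophia_style_step(theta, unlearn_grad(theta),
                              unlearn_hess_diag(theta), state)

forget_after = forget_loss(theta)
# Closed-form minimizer of the toy objective, used only as a sanity check.
theta_star = (A_RETAIN * C_RETAIN - A_FORGET * C_FORGET) / (A_RETAIN - A_FORGET)
print("theta:", theta, "forget loss:", forget_before, "->", forget_after)
```

In this sketch the clipping bounds every coordinate's move by lr per step, which keeps early updates controlled while the Hessian average is still warming up. The toy curvatures are chosen so the combined objective is convex; with negative curvature the max(gamma * h, eps) floor would make the step fall back to a sign-like update.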

Paper Structure (36 sections, 10 equations, 2 figures, 14 tables, 1 algorithm)

Figures (2)

  • Figure 1: Performance highlight of second-order (SO) optimization (SOUL) on the TOFU dataset [maini2024tofu] for fictitious unlearning. (Left) Examples of text outputs from LLMs after unlearning with various approaches, including first-order (FO) GradDiff (gradient difference) [liu2022continual, maini2024tofu] and PO (preference optimization) [maini2024tofu, eldan2023whos], as well as their SO counterparts. Failed unlearning is indicated by undesired answers marked in red, while successful unlearning is highlighted in green for desired answers. (Right) Quantitative evaluation comparing SO unlearning with FO unlearning using the metrics forget quality and model utility, as detailed in the experiments section.
  • Figure 2: Unlearning performance versus optimization epochs using different optimizers in TOFU unlearning. Left: forget accuracy vs. epochs; Right: retain accuracy vs. epochs.