SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning
Jinghan Jia, Yihua Zhang, Yimeng Zhang, Jiancheng Liu, Bharat Runwal, James Diffenderfer, Bhavya Kailkhura, Sijia Liu
TL;DR
This work identifies optimizer choice, particularly second-order methods, as a pivotal factor in LLM unlearning and introduces SOUL, a second-order, loss-agnostic framework built on the Sophia optimizer to iteratively unlearn targeted data. By linking influence unlearning with Newton-style updates and employing a diagonal Hessian approximation, SOUL improves forgetting efficacy while preserving utility across TOFU, copyright removal, and detoxification tasks. Empirical results show SOUL outperforms traditional first-order methods in forget quality, MIA robustness, and downstream accuracy, with reasonable time overhead and enhanced resilience to jailbreaking prompts. The approach provides a practical, broadly applicable path to more effective LLM unlearning and motivates further exploration of second-order optimization in data-removal contexts.
Abstract
The rise of Large Language Models (LLMs) has highlighted the need for effective unlearning mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims to remove undesired data influences and the associated model capabilities without compromising utility beyond the scope of unlearning. While interest in LLM unlearning is growing, the impact of optimizer choice on LLM unlearning remains unexplored. In this work, we shed light on the significance of optimizer selection in LLM unlearning for the first time, establishing a clear connection between second-order optimization and influence unlearning (a classical approach that uses influence functions to update the model for data influence removal). This insight leads us to develop a second-order optimization-based LLM unlearning framework, termed Second-Order UnLearning (SOUL), which extends the static, one-shot model update of influence unlearning to a dynamic, iterative unlearning process. Our extensive experiments show that SOUL consistently outperforms conventional first-order methods across various unlearning tasks, models, and metrics, indicating that second-order optimization offers an effective and broadly applicable solution for LLM unlearning. Code is available at https://github.com/OPTML-Group/SOUL.
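
To make the idea of an iterative, second-order update concrete, the following is a minimal sketch of a Sophia-style step: a momentum update preconditioned by a diagonal Hessian estimate and clipped per coordinate. It is not the released SOUL code. The function names, the hyperparameter values, the two-parameter quadratic that stands in for an unlearning loss, and the choice to initialize the Hessian average with the exact diagonal are all assumptions made for illustration.

```python
import numpy as np

def sophia_style_step(theta, grad, hess_diag, m, h,
                      lr=0.1, beta1=0.9, beta2=0.99, gamma=1.0, eps=1e-12):
    """One clipped, diagonal-Hessian-preconditioned update (Sophia-style sketch)."""
    m = beta1 * m + (1.0 - beta1) * grad        # moving average of gradients
    h = beta2 * h + (1.0 - beta2) * hess_diag   # moving average of diagonal Hessian estimates
    ratio = m / np.maximum(gamma * h, eps)      # Newton-like per-coordinate step
    theta = theta - lr * np.clip(ratio, -1.0, 1.0)  # each coordinate moves by at most lr
    return theta, m, h

def run_toy(steps=300):
    """Minimize a hypothetical ill-conditioned quadratic standing in for an unlearning loss."""
    a = np.array([1.0, 100.0])        # per-coordinate curvature (assumed, for illustration)
    target = np.zeros(2)              # minimizer of the toy objective
    loss = lambda th: 0.5 * np.sum(a * (th - target) ** 2)

    theta = np.array([1.0, 1.0])
    m = np.zeros_like(theta)
    h = a.copy()                      # simplification: start from the exact Hessian diagonal
    initial_loss = loss(theta)
    for _ in range(steps):
        grad = a * (theta - target)   # exact gradient of the toy loss
        theta, m, h = sophia_style_step(theta, grad, a, m, h)
    return theta, initial_loss, loss(theta)

if __name__ == "__main__":
    theta, initial_loss, final_loss = run_toy()
    print("final parameters:", theta)
    print("loss before / after:", initial_loss, final_loss)
```

In this toy setting both coordinates follow the same trajectory even though their curvatures differ by a factor of 100, because dividing by the Hessian diagonal rescales each coordinate. A single learning rate therefore serves both directions, whereas plain gradient descent would have to shrink its step size to suit the steepest one. The clipping bounds every per-coordinate move by `lr`, which keeps the step controlled when the curvature estimate is very small.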
