Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs
Thierry Bossy, Julien Vignoud, Tahseen Rabbani, Juan R. Troncoso Pastoriza, Martin Jaggi
TL;DR
The paper addresses unintended memorization in FL-trained LLMs and demonstrates that LoRA fine-tuning substantially reduces data regurgitation at only a modest cost to predictive utility. It provides extensive empirical evidence across centralized and federated settings, using Llama-2/3 and Mistral models on medical QA tasks, and shows that LoRA is compatible with Goldfish loss, gradient clipping, Gaussian noise, secure aggregation, and DP mechanisms. Key findings include up to a 10x reduction in memorization, modest accuracy trade-offs, and a large reduction in communication overhead in FL. The work offers practical guidance for privacy-preserving LLM fine-tuning in data-sensitive domains and suggests avenues for theoretical analysis of why LoRA mitigates memorization. Overall, LoRA emerges as a lightweight, scalable tool for improving privacy in FL while largely preserving performance, especially when combined with complementary privacy techniques.
Abstract
Federated learning (FL) is a popular paradigm for collaborative training which avoids direct data exposure between clients. However, data privacy issues remain: FL-trained large language models are capable of memorizing and completing phrases and sentences contained in training data when given their prefixes. Thus, it is possible for adversarial and honest-but-curious clients to recover training data of other participants simply through targeted prompting. In this work, we demonstrate that a popular and simple fine-tuning strategy, low-rank adaptation (LoRA), reduces memorization during FL by up to a factor of 10. We study this effect by performing a medical question-answering fine-tuning task and injecting multiple replicas of out-of-distribution sensitive sequences drawn from an external clinical dataset. We observe a reduction in memorization for a wide variety of Llama 2 and 3 models, and find that LoRA can reduce memorization in centralized learning as well. Furthermore, we show that LoRA can be combined with other privacy-preserving techniques such as gradient clipping and Gaussian noising, secure aggregation, and Goldfish loss to further improve record-level privacy while maintaining performance.
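
To make two of the ideas above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single weight matrix of shape d_out x d_in adapted by low-rank factors B (d_out x rank) and A (rank x d_in), and it assumes that clipping and Gaussian noise are applied to each client's flattened update before it is sent. The layer shape, rank, clipping norm, and noise scale are hypothetical values chosen for illustration; none of them come from the paper.

```python
import math
import random


def lora_param_count(d_out, d_in, rank):
    """Trainable parameters in low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def clip_and_noise(update, clip_norm, noise_std, rng):
    """Rescale `update` so its L2 norm is at most `clip_norm`,
    then add independent Gaussian noise to every coordinate."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale + rng.gauss(0.0, noise_std) for x in update]


# Hypothetical layer shape and rank (assumptions, not values from the paper).
d_out, d_in, rank = 4096, 4096, 16
full = d_out * d_in                       # parameters sent under full fine-tuning
adapter = lora_param_count(d_out, d_in, rank)  # parameters sent with LoRA
print(full, adapter, full // adapter)

# Clip a toy update to unit norm; noise_std=0.0 isolates the clipping step.
rng = random.Random(0)
noisy = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_std=0.0, rng=rng)
print(noisy)
```

Under these assumed shapes, the adapter carries 128 times fewer parameters per layer than the full matrix, which is the kind of saving behind the reduced communication cost of LoRA in FL. Setting `noise_std` above zero adds the Gaussian noising step; how the noise scale maps to a formal privacy guarantee is outside the scope of this sketch.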
