Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Xinyu Tang, Ashwinee Panda, Milad Nasr, Saeed Mahloujifar, Prateek Mittal
TL;DR
This work tackles private fine-tuning of large language models by replacing gradient-based DP training with zeroth-order optimization that privatizes only a scalar step size. The DP-ZO method uses SPSA-like updates, Poisson subsampling, and either Gaussian or Laplace noise to ensure $(\varepsilon,\delta)$-DP, achieving memory efficiency and scalability to models up to 66B parameters. Empirical results across SQuAD, DROP, and SST2 show DP-ZO can match DP-SGD performance at similar model sizes and enable nontrivial pure $\varepsilon$-DP utility, with notable memory advantages, especially for long sequences. The approach offers a practical, scalable pathway for privacy-preserving fine-tuning of foundation models, with potential extensions to other domains and DP mechanisms.
Abstract
Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner, but has proven difficult to scale to the era of foundation models. We introduce DP-ZO, a framework for privately fine-tuning large language models that privatizes zeroth-order optimization methods. The key insight behind our design is that, in the zeroth-order optimization we use, the direction of the gradient estimate is random and the only information derived from the training data is the step size, i.e., a scalar. Therefore, we only need to privatize the scalar step size, which is memory-efficient. DP-ZO provides a strong privacy-utility trade-off across different tasks and model sizes, comparable to that of DP-SGD under $(\varepsilon,\delta)$-DP. Notably, DP-ZO has significant advantages over DP-SGD in memory efficiency, and obtains higher utility in $\varepsilon$-DP when using the Laplace mechanism.
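To make the "privatize only a scalar" idea concrete, the following is a minimal NumPy sketch of one update step, not the authors' implementation. It assumes a two-point (SPSA-style) loss difference per example, clipping of that scalar, and Gaussian noise. The function name `dp_zo_step`, its parameters (`phi`, `clip`, `sigma`), and the exact normalization are illustrative assumptions; Poisson subsampling and privacy accounting are omitted.

```python
import numpy as np

def dp_zo_step(theta, per_example_loss, batch, lr=0.01, phi=1e-3,
               clip=1.0, sigma=1.0, rng=None):
    """One illustrative DP-ZO-style update (hypothetical helper).

    theta            : 1-D parameter vector
    per_example_loss : callable (theta, example) -> float
    batch            : sequence of examples
    """
    rng = np.random.default_rng() if rng is None else rng

    # Random direction: drawn independently of the training data.
    z = rng.standard_normal(theta.shape)

    # The only data-dependent quantity: one scalar loss difference per example.
    diffs = np.array([
        per_example_loss(theta + phi * z, ex) - per_example_loss(theta - phi * z, ex)
        for ex in batch
    ])

    # Clip each scalar to bound per-example sensitivity, then add Gaussian noise
    # to the sum (a Laplace draw could be substituted here for pure eps-DP).
    clipped = np.clip(diffs, -clip, clip)
    noisy_sum = clipped.sum() + rng.normal(0.0, sigma * clip)

    # Privatized scalar step size (assumed normalization: batch size and 2*phi).
    step = noisy_sum / (len(batch) * 2.0 * phi)

    # The update moves along the random direction, scaled by the noisy scalar.
    return theta - lr * step * z, step
```

Because the noise is added to a single scalar rather than to a full gradient vector, no per-example gradients need to be stored, which is the source of the memory savings described in the abstract.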
