Characterizing the Training Dynamics of Private Fine-tuning with Langevin diffusion
Shuqi Ke, Charlie Hou, Sewoong Oh, Giulia Fanti
TL;DR
This work analyzes the training dynamics of privately fine-tuning pre-trained backbones and reveals that differentially private full fine-tuning (DP-FFT) can distort backbone representations because the backbone is misaligned with a randomly initialized linear head. It introduces a zeroth-order Langevin-diffusion approximation that preserves multi-layer interactions while enabling tractable analysis of DP-SGD, and shows that a preliminary phase of differentially private linear probing (DP-LP) can mitigate early feature distortion by aligning the head with the backbone representations. The authors derive convergence bounds for both DP-LP and DP-FFT in a simple 2-layer ReLU setting, and provide a theory-backed budget-allocation framework indicating when to favor LP over FFT under privacy constraints. Experiments on real datasets validate the theory, illustrate the distortion-and-alignment dynamics, and demonstrate practical privacy-utility trade-offs across architectures and benchmarks. Overall, the work offers principled guidance for designing multi-phase private fine-tuning strategies and points toward a better understanding of privacy-budget allocation in complex models.
Abstract
We show, both theoretically and empirically, that differentially private full fine-tuning (DP-FFT) can distort pre-trained backbone features. We identify the cause of the distortion as the misalignment between the pre-trained backbone and the randomly initialized linear head. We prove that a sequential fine-tuning strategy, first linear probing and then full fine-tuning (DP-LP-FFT), can mitigate the feature distortion. A new approximation scheme allows us to derive approximate upper and lower bounds on the training loss of differentially private linear probing (DP-LP) and DP-FFT in a simple but canonical setting: 2-layer neural networks with ReLU activation. Experiments on real-world datasets and architectures are consistent with our theoretical insights. We also derive new upper bounds for 2-layer linear networks without the approximation. Moreover, our theory suggests a trade-off in privacy budget allocation for multi-phase fine-tuning methods such as DP-LP-FFT.
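To make the two-phase strategy concrete, the following is a minimal NumPy sketch of DP-LP-FFT on a 2-layer ReLU network f(x) = v · relu(W x) with squared loss. It is an illustration under stated assumptions, not the authors' implementation: the function names, the full-batch update, the hyperparameter defaults, and the use of standard DP-SGD (per-example clipping plus Gaussian noise) are all choices made here for illustration, and no privacy accounting is performed.

```python
import numpy as np

def per_example_grads(W, v, X, y):
    """Per-example gradients of 0.5 * (f(x) - y)^2 for f(x) = v . relu(W x)."""
    pre = X @ W.T                      # pre-activations, shape (n, h)
    hidden = np.maximum(pre, 0.0)      # ReLU features, shape (n, h)
    resid = hidden @ v - y             # residuals, shape (n,)
    g_v = resid[:, None] * hidden      # gradient w.r.t. head v, shape (n, h)
    # Gradient w.r.t. backbone W: resid * v[j] * 1[pre_j > 0] * x, shape (n, h, d)
    g_W = (resid[:, None] * (pre > 0) * v)[:, :, None] * X[:, None, :]
    return g_W, g_v

def dp_step(W, v, X, y, lr, clip, sigma, rng, train_backbone):
    """One full-batch DP-SGD step: clip each example's gradient, add Gaussian noise."""
    n = X.shape[0]
    g_W, g_v = per_example_grads(W, v, X, y)
    if not train_backbone:
        g_W = np.zeros_like(g_W)       # linear probing: backbone is frozen
    flat = np.concatenate([g_W.reshape(n, -1), g_v], axis=1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat * np.minimum(1.0, clip / (norms + 1e-12))   # per-example clipping
    noisy = (flat.sum(axis=0) + sigma * clip * rng.standard_normal(flat.shape[1])) / n
    if train_backbone:
        W = W - lr * noisy[:W.size].reshape(W.shape)
    v = v - lr * noisy[W.size:]        # the head is trained in both phases
    return W, v

def dp_lp_then_fft(W0, v0, X, y, lp_steps, fft_steps,
                   lr=0.05, clip=1.0, sigma=0.5, seed=0):
    """Phase 1: DP linear probing (head only). Phase 2: DP full fine-tuning."""
    rng = np.random.default_rng(seed)
    W, v = W0.copy(), v0.copy()
    for t in range(lp_steps + fft_steps):
        W, v = dp_step(W, v, X, y, lr, clip, sigma, rng,
                       train_backbone=(t >= lp_steps))
    return W, v
```

In this sketch, the split between `lp_steps` and `fft_steps` plays the role of the budget allocation between phases, and a quantity such as `np.linalg.norm(W - W0)` can serve as a rough proxy for how far the backbone has moved from its pre-trained value. Setting `lp_steps=0` recovers plain DP-FFT from a randomly initialized head.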
