Federated Low-Rank Adaptation with Differential Privacy over Wireless Networks

Tianqu Kang, Zixin Wang, Hengtao He, Jun Zhang, Shenghui Song, Khaled B. Letaief

TL;DR

The paper proposes a split federated fine-tuning (FedFT) framework with differential privacy (DP) over wireless networks. It uses the inherent channel noise of the uplink transmission to achieve DP guarantees, without adding extra artificial noise.

Abstract

Fine-tuning large pre-trained foundation models (FMs) on distributed edge devices presents considerable computational and privacy challenges. Federated fine-tuning (FedFT) mitigates some privacy issues by enabling collaborative model training without sharing raw data. To lessen the computational burden on resource-limited devices, combining low-rank adaptation (LoRA) with federated learning enables parameter-efficient fine-tuning. Additionally, the split FedFT architecture partitions an FM between edge devices and a central server, so that individual devices need not host the complete model. However, the risk of privacy eavesdropping attacks in FedFT remains a concern, particularly in sensitive areas such as healthcare and finance. In this paper, we propose a split FedFT framework with differential privacy (DP) over wireless networks, where the inherent wireless channel noise in the uplink transmission is utilized to achieve DP guarantees without adding extra artificial noise. We investigate the impact of the wireless noise on the convergence performance of the proposed framework. We also show that, by updating only one of the low-rank matrices in the split FedFT with DP, the proposed method can mitigate the noise amplification effect. Simulation results demonstrate that the proposed framework achieves higher accuracy than baseline methods under strict privacy budgets.

Paper Structure

This paper contains 14 sections, 1 theorem, 29 equations, 3 figures, 1 table.

Key Result

Theorem 1

Upon the fading channel with channel noise $\boldsymbol{n}_k(t)$ and training epoch $T$, there exist constants $c_1$ that the gradient transmission of edge device $k$ satisfies $(\varepsilon, \delta)$-local differential privacy, such that

Figures (3)

  • Figure 1: System model of the proposed LoRA-based FedFT, focusing on the communication between the $k$-th edge device and the edge server. The computation-intensive encoder resides at the edge server, while the embedding and task modules are on the edge devices. The forward pass is represented by solid black arrows, while the backward pass is shown with dashed arrows.
  • Figure 2: Computation graph of the $i$-th layer in the LoRA architecture during backpropagation, illustrating the propagation of gradient noise. The forward pass is represented by solid black arrows, while the backward pass is shown with dashed arrows. Noise introduced in the gradient during backpropagation is indicated by red text beneath the corresponding backward arrows.
  • Figure 3: Test accuracy per epoch for different training configurations across varying privacy budgets ($\varepsilon = 3$, $\varepsilon = 5$, $\varepsilon = 10$, and $\varepsilon = 100$). The configurations are: (1) updating both low-rank matrices $\mathbf{A}$ and $\mathbf{B}$ (vanilla LoRA), (2) updating only $\mathbf{B}$ with $\mathbf{A}$ fixed as a Gaussian matrix, and (3) updating only $\mathbf{B}$ with $\mathbf{A}$ initialized as a scaled orthonormal matrix.
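The three configurations compared in Figure 3 can be illustrated with a small NumPy sketch. It is not the paper's implementation: the helper names (`init_A`, `product_update`, `noise_deviation`), the Gaussian scaling, the `scale` factor of the orthonormal initialization, and the modelling of channel noise as an additive perturbation on the upstream gradient of a single LoRA layer are all assumptions made for illustration. Under those assumptions, the sketch shows why freezing $\mathbf{A}$ keeps the noise contribution to the update of $\mathbf{B}\mathbf{A}$ linear, whereas updating both matrices adds a term that is quadratic in the noise.

```python
import numpy as np

def init_A(kind, r, d_in, rng, scale=1.0):
    """Build a fixed r x d_in matrix A (hypothetical helper, assumed scalings)."""
    if kind == "gaussian":
        # Assumed variance; the paper's exact choice is not given here.
        return rng.normal(size=(r, d_in)) / np.sqrt(d_in)
    if kind == "orthonormal":
        # Reduced QR of a d_in x r Gaussian yields orthonormal columns;
        # transposing gives orthonormal rows, so A @ A.T == scale**2 * I_r.
        q, _ = np.linalg.qr(rng.normal(size=(d_in, r)))
        return scale * q.T
    raise ValueError(f"unknown init: {kind}")

def product_update(B, A, x, g, noise, lr, train_A):
    """Change in B @ A after one SGD step on a single sample.

    x is the layer input, g the clean upstream gradient w.r.t. the
    layer output, and noise an additive perturbation on g (assumption).
    """
    g_noisy = g + noise
    dB = np.outer(g_noisy, A @ x)          # gradient w.r.t. B
    B_new = B - lr * dB
    if train_A:
        dA = np.outer(B.T @ g_noisy, x)    # gradient w.r.t. A
        A_new = A - lr * dA
    else:
        A_new = A                          # A stays frozen
    return B_new @ A_new - B @ A

def noise_deviation(B, A, x, g, noise, lr, train_A):
    """Part of the update of B @ A that is caused by the noise alone."""
    clean = product_update(B, A, x, g, np.zeros_like(noise), lr, train_A)
    return product_update(B, A, x, g, noise, lr, train_A) - clean
```

With $\mathbf{A}$ frozen, doubling the noise doubles `noise_deviation`. With both matrices trained, the deviation also contains the product of the two noisy gradients scaled by `lr**2`, so it no longer scales linearly with the noise.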

Theorems & Definitions (4)

  • Definition 1: $(\varepsilon, \delta)$-DP and $\ell_2$ sensitivity
  • Theorem 1
  • proof
  • Remark 1
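
For orientation, the textbook forms of the two notions named in Definition 1 are given below. The paper's exact wording is not reproduced on this page, so the notation here (mechanism $\mathcal{M}$, adjacent datasets $\mathcal{D}, \mathcal{D}'$, query $f$, sensitivity $\Delta_2 f$) is an assumption and may differ from the authors'. A randomized mechanism $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-DP if, for all adjacent datasets $\mathcal{D}, \mathcal{D}'$ and every measurable output set $\mathcal{S}$,

$$
\Pr\left[\mathcal{M}(\mathcal{D}) \in \mathcal{S}\right] \le e^{\varepsilon} \Pr\left[\mathcal{M}(\mathcal{D}') \in \mathcal{S}\right] + \delta ,
$$

and the $\ell_2$ sensitivity of a query $f$ is

$$
\Delta_2 f = \max_{\mathcal{D}, \mathcal{D}'} \left\| f(\mathcal{D}) - f(\mathcal{D}') \right\|_2 .
$$

In the classical Gaussian mechanism, adding zero-mean Gaussian noise with standard deviation $\sigma \ge \Delta_2 f \sqrt{2 \ln(1.25/\delta)} / \varepsilon$ to $f(\mathcal{D})$ yields $(\varepsilon, \delta)$-DP for $\varepsilon \in (0, 1)$. This is the standard sense in which a sufficiently strong Gaussian perturbation, such as the channel noise considered here, can provide a DP guarantee.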