R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models

Aladin Djuhera, Vlad C. Andrei, Xinyang Li, Ullrich J. Mönich, Holger Boche, Walid Saad

TL;DR

The paper addresses adversarial jamming of LLM embeddings in split federated learning over wireless networks by deriving a loss-divergence bound tied to communication MSE under a relaxed coordinate-wise $(L_0, L_1)$-smoothness model. It introduces R-SFLLM, a sensing-assisted anti-jamming framework that uses DoA information to formulate a joint beamforming, scheduling, and power-allocation problem, solved via an iterative water-filling approach and supported by a conservative surrogate covariance. A worst-case jamming strategy is developed as a benchmark, and minimum-rate guarantees are established to characterize reliable SFL under interference. Extensive NLP and CV experiments demonstrate near-baseline performance under worst-case jamming when protected by R-SFLLM, confirming the practical value of integrating physical-layer resilience into distributed training for LLMs in future 6G edge networks.

Abstract

Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML), where components of large ML models are outsourced to remote servers. A significant challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming that could jeopardize the learning process. This is particularly pronounced for embedding parameters in large language models (LLMs) and vision-language models (VLMs), which are learned feature vectors essential for domain understanding. In this paper, rigorous insights are provided into the influence of jamming on embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). Based on this analysis, a physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks. R-SFLLM leverages wireless sensing data to gather information on the jamming directions-of-arrival (DoAs) in order to devise a novel, sensing-assisted anti-jamming strategy while jointly optimizing beamforming, user scheduling, and resource allocation. Extensive experiments using both LLMs and VLMs demonstrate R-SFLLM's effectiveness, achieving close-to-baseline performance across various natural language processing (NLP) and computer vision (CV) tasks, datasets, and modalities. The proposed methodology further introduces an adversarial training component, where controlled noise exposure significantly enhances the model's resilience to perturbed parameters during training. The results show that more noise-sensitive models, such as RoBERTa, benefit from this feature, especially when resource allocation is unfair. It is also shown that worst-case jamming in particular translates into worst-case model outcomes, thereby necessitating jamming-resilient SFL protocols.
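
As a point of reference for the resource-allocation step, the following is a minimal sketch of textbook water-filling over parallel subcarriers. It is not the paper's iterative joint algorithm, which also covers beamforming and user scheduling. The names `water_filling`, `gains`, and `total_power`, as well as all numeric values, are illustrative assumptions, and jamming is modeled here simply as extra interference power in the denominator of each subcarrier's effective gain.

```python
import numpy as np

def water_filling(gains, total_power):
    """Textbook water-filling: maximize sum(log2(1 + p_i * g_i))
    subject to sum(p_i) = total_power and p_i >= 0.

    gains: positive effective gains g_i (signal gain divided by
           noise-plus-interference power). Hypothetical helper.
    Returns the allocation p_i = max(0, mu - 1/g_i).
    """
    gains = np.asarray(gains, dtype=float)
    inv = 1.0 / gains                  # "floor height" of each channel
    order = np.argsort(inv)            # best channels (lowest floor) first
    inv_sorted = inv[order]
    n = len(inv_sorted)

    # Try the k best channels, shrinking k until all k get positive power.
    for k in range(n, 0, -1):
        mu = (total_power + inv_sorted[:k].sum()) / k   # water level
        if mu > inv_sorted[k - 1]:
            break

    p_sorted = np.maximum(mu - inv_sorted, 0.0)
    p = np.empty(n)
    p[order] = p_sorted                # undo the sort
    return p

# Illustrative values (assumptions): three subcarriers with equal signal
# gain, where the third is hit by strong jamming.
signal_gain = np.array([1.0, 1.0, 1.0])
noise = 0.1
jam = np.array([0.0, 0.4, 10.0])
gains = signal_gain / (noise + jam)

power = water_filling(gains, total_power=1.0)
# The heavily jammed third subcarrier receives no power.
```

The sketch shows only the qualitative behavior: power is shifted away from subcarriers where interference dominates, while the total power budget is still met.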

Paper Structure

This paper contains 35 sections, 6 theorems, 45 equations, 13 figures, 2 tables, and 2 algorithms.

Key Result

Lemma 1

Let $L$ be coordinate-wise $(\boldsymbol{L}_0, \boldsymbol{L}_1)$-smooth. Then for any $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{R}^E$ with $\lVert \boldsymbol{x} - \boldsymbol{y} \rVert_2 \leq \frac{1}{\lVert \boldsymbol{L}_1 \rVert_{\infty}}$, we have

Figures (13)

  • Figure 1: SFL model split with LLM embeddings being processed at the client and with attention and head layers being processed at the server.
  • Figure 2: R-SFLLM system architecture for distributed training over MIMO-OFDM wireless channels, augmented by sensing-assisted anti-jamming capabilities.
  • Figure 3: SFL setup with $Q = 3$ legitimate parties, one adversarial jammer, and corresponding user and jamming DoAs, denoted by $\theta_{H_q}$ and $\theta_J$, respectively.
  • Figure 4: Global SFL model performance plots for NLP experiments with BERT/RoBERTa, evaluated after each of the $N_\text{rounds}$ global rounds for all four scenarios (SFL Baseline, Gaussian, No Protection, Protection) using Accuracy and F1 Scores (higher is better). Accuracy is calculated as the ratio of correctly classified sentences/words to the total number of instances, while the F1 Score is the harmonic mean of precision (ratio of true positive observations to the total number of predicted positives) and recall (ratio of true positive observations to the number of actual positives, i.e. the sum of true positives and false negatives).
  • Figure 5: Results for fine-tuning RoBERTa on QNLI with all client plots.
  • ...and 8 more figures
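
The Accuracy and F1 definitions given in the Figure 4 caption can be sketched as follows. This is a minimal illustration for the binary case only; the function name `accuracy_and_f1` and the toy labels are assumptions, and the paper may use a different averaging scheme (for example, macro or micro F1) for multi-class tasks.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Binary accuracy and F1 score, following the caption's definitions.

    Hypothetical helper: `positive` marks which label counts as positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Accuracy: correctly classified instances over all instances.
    accuracy = correct / len(pairs) if pairs else 0.0
    # Precision: true positives over all predicted positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: true positives over all actual positives (TP + FN).
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1

# Toy labels (assumed): 3 of 5 correct, with TP = 2, FP = 1, FN = 1.
acc, f1 = accuracy_and_f1([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
```

Returning 0.0 when a denominator vanishes is a convention chosen for this sketch, not something the source specifies.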

Theorems & Definitions (13)

  • Definition 1
  • Lemma 1
  • Lemma 2
  • proof
  • Corollary 1
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Corollary 2
  • ...and 3 more