R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models

Aladin Djuhera; Vlad C. Andrei; Xinyang Li; Ullrich J. Mönich; Holger Boche; Walid Saad

TL;DR

The paper addresses adversarial jamming of LLM embeddings in split federated learning over wireless networks by deriving a loss-divergence bound tied to communication MSE under a relaxed coordinate-wise (L0,L1)-smoothness model. It introduces R-SFLLM, a sensing-assisted anti-jamming framework that uses DoA information to formulate a joint beamforming, scheduling, and power-allocation problem, solved via an iterative water-filling approach and supported by a conservative surrogate covariance. A worst-case jamming strategy is developed as a benchmark, and minimum-rate guarantees are established to characterize reliable SFL under interference. Extensive NLP and CV experiments demonstrate near-baseline performance under worst-case jamming when protected by R-SFLLM, confirming the practical value of integrating physical-layer resilience into distributed training for LLMs in future 6G-edge networks.
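To make the water-filling idea mentioned above concrete, the following is a minimal sketch of the textbook single-user water-filling power allocation. It is not the paper's joint beamforming, scheduling, and power-allocation algorithm. The function name `water_filling`, the `gains` vector (assumed here to be effective per-subcarrier gain-to-interference-plus-noise ratios), and the `total_power` budget are all illustrative assumptions.

```python
import numpy as np

def water_filling(gains, total_power):
    """Textbook water-filling: maximize sum(log2(1 + p_i * g_i))
    subject to sum(p_i) = total_power and p_i >= 0.

    `gains` are hypothetical effective channel gains (gain divided by
    noise-plus-interference power); they are an assumption of this sketch.
    """
    gains = np.asarray(gains, dtype=float)
    floors = 1.0 / gains                 # "floor" height of each channel
    order = np.argsort(floors)           # best channels (lowest floor) first
    floors_sorted = floors[order]
    power = np.zeros_like(gains)

    # Try using the k best channels, shrinking k until every active
    # channel sits strictly below the water level mu.
    for k in range(len(gains), 0, -1):
        mu = (total_power + floors_sorted[:k].sum()) / k
        if mu > floors_sorted[k - 1]:
            power[order[:k]] = mu - floors_sorted[:k]
            break
    return power

if __name__ == "__main__":
    p = water_filling([2.0, 1.0], total_power=2.0)
    print(p, p.sum())
```

In this sketch, channels with a poor effective gain (for example, subcarriers heavily hit by a jammer) receive little or no power, which is the intuition behind allocating resources away from jammed dimensions.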

Abstract

Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML), where components of large ML models are outsourced to remote servers. A significant challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming that could jeopardize the learning process. This is particularly pronounced for embedding parameters in large language models (LLMs) and vision language models (VLMs), which are learned feature vectors essential for domain understanding. In this paper, rigorous insights are provided into the influence of jamming embeddings in SFL by deriving an expression for the ML training loss divergence and showing that it is upper-bounded by the mean squared error (MSE). Based on this analysis, a physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks. R-SFLLM leverages wireless sensing data to gather information on the jamming directions-of-arrival (DoAs) for the purpose of devising a novel, sensing-assisted anti-jamming strategy while jointly optimizing beamforming, user scheduling, and resource allocation. Extensive experiments using both LLMs and VLMs demonstrate R-SFLLM's effectiveness, achieving close-to-baseline performance across various natural language processing (NLP) and computer vision (CV) tasks, datasets, and modalities. The proposed methodology further introduces an adversarial training component, where controlled noise exposure significantly enhances the model's resilience to perturbed parameters during training. The results show that more noise-sensitive models, such as RoBERTa, benefit from this feature, especially when resource allocation is unfair. It is also shown that worst-case jamming in particular translates into worst-case model outcomes, thereby necessitating jamming-resilient SFL protocols.

Paper Structure (35 sections, 6 theorems, 45 equations, 13 figures, 2 tables, 2 algorithms)

Introduction and Motivation
Adversarial Poisoning in Wireless Federated LLM Training.
Proactive and Resilient-by-Design Anti-Jamming in SFL.
Contributions.
System Model and Adversarial Analysis
Wireless R-SFLLM System Model.
Adversarial Jamming Impact on LLM Training in SFL.
Assumptions on the Loss Function
Upper Bound on the LLM Loss Divergence
Relating the Model Error to the Communication MSE
Practical Interpretation of Results
Minimum System Rate for Reliable SFL with LLMs.
R-SFLLM Anti-Jamming Framework
Anti-Jamming Strategy and Optimization Problem.
Role of Sensing-Assisted Jamming DoA Information.
...and 20 more sections

Key Result

Lemma 1

Let $L$ be $(\boldsymbol{L_0}, \boldsymbol{L_1})$-smooth coordinate-wisely. Then for any $\boldsymbol{x}, \boldsymbol{y} \in \mathbb{R}^E$ with $\| \boldsymbol{x} - \boldsymbol{y} \|_2 \leq \frac{1}{\| \boldsymbol{L_1} \|_{\infty}}$, we have

Figures (13)

Figure 1: SFL model split with LLM embeddings being processed at the client and with attention and head layers being processed at the server.
Figure 2: R-SFLLM system architecture for distributed training over MIMO-OFDM wireless channels, augmented by sensing-assisted anti-jamming capabilities.
Figure 3: SFL setup with $Q = 3$ legitimate parties, one adversarial jammer, and corresponding user and jamming DoAs, denoted by $\theta_{H_q}$ and $\theta_J$, respectively.
Figure 4: Global SFL model performance plots for NLP experiments with BERT/RoBERTa, evaluated after each of the $N_\text{rounds}$ global rounds for all four scenarios (SFL Baseline, Gaussian, No Protection, Protection) using Accuracy and F1 Scores (higher is better). Accuracy is calculated as the ratio of correctly classified sentences/words to the total number of instances, while the F1 Score is the harmonic mean of precision (ratio of true positive observations to the total number of predicted positives) and recall (ratio of true positive observations to the number of actual positives, i.e. the sum of true positives and false negatives).
Figure 5: Results for fine-tuning RoBERTa on QNLI with all client plots.
...and 8 more figures
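
The Accuracy and F1 Score definitions given in the Figure 4 caption can be illustrated with a short sketch for the binary case. The function name `classification_metrics` and the toy labels are illustrative assumptions, not taken from the paper's evaluation code.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for binary labels,
    following the definitions in the Figure 4 caption."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against empty denominators (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

if __name__ == "__main__":
    # Hypothetical toy labels, for illustration only.
    print(classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```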

Theorems & Definitions (13)

Definition 1
Lemma 1
Lemma 2
proof
Corollary 1
Proposition 1
proof
Proposition 2
proof
Corollary 2
...and 3 more
