Parameter-Efficient and Personalized Federated Training of Generative Models at the Edge

Kabir Khan, Manju Sarkar, Anita Kar, Suresh Ghosh

TL;DR

FedGen-Edge presents a privacy-conscious, communication-efficient framework for training large generative models at the edge by decoupling a frozen global backbone from small, client-specific LoRA adapters. By federating only the adapters (costs scale with $N_{\text{LoRA}}$, typically <1% of $N_{\text{Total}}$), the approach achieves >99% uplink reduction and improved stability under non-IID data, while enabling per-client personalization through local adapter fine-tuning. Empirical results on Penn Treebank and CIFAR-10 show FedGen-Edge outperforming full-model FedAvg and strong personalization baselines in generation quality (PPL and FID) and convergence speed, with notable gains from adapter-level aggregation and a well-chosen LoRA rank and local epoch setting. The work demonstrates a practical path toward privacy-preserving, resource-aware, and personalized generative AI on edge devices, with broader implications for on-device LMs and diffusion models in heterogeneous networks.

Abstract

Large generative models (for example, language and diffusion models) enable high-quality text and image synthesis but are hard to train or adapt in cross-device federated settings due to heavy computation and communication and statistical/system heterogeneity. We propose FedGen-Edge, a framework that decouples a frozen, pre-trained global backbone from lightweight client-side adapters and federates only the adapters. Using Low-Rank Adaptation (LoRA) constrains client updates to a compact subspace, which reduces uplink traffic by more than 99 percent versus full-model FedAvg, stabilizes aggregation under non-IID data, and naturally supports personalization because each client can keep a locally tuned adapter. On language modeling (PTB) and image generation (CIFAR-10), FedGen-Edge achieves lower perplexity/FID and faster convergence than strong baselines while retaining a simple FedAvg-style server. A brief ablation shows diminishing returns beyond moderate LoRA rank and a trade-off between local epochs and client drift. FedGen-Edge offers a practical path toward privacy-preserving, resource-aware, and personalized generative AI on heterogeneous edge devices.
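
To make the uplink claim concrete, the sketch below counts what a client would upload for a single LoRA-adapted weight matrix compared with the dense matrix itself. It relies on the standard LoRA parameterization, in which a rank-$r$ update to a $d_{\text{out}} \times d_{\text{in}}$ matrix adds $r\,(d_{\text{in}} + d_{\text{out}})$ trainable parameters. The layer size and rank are hypothetical values chosen for illustration, not the paper's configuration, and the paper's ratio is taken over the whole backbone rather than one matrix.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def uplink_reduction(n_lora: int, n_total: int) -> float:
    """Fraction of upload traffic saved by sending n_lora instead of n_total parameters."""
    return 1.0 - n_lora / n_total


# Hypothetical layer: one 4096 x 4096 projection adapted with rank 8.
d_in = d_out = 4096
rank = 8

n_full = d_in * d_out                          # dense matrix sent by full-model FedAvg
n_lora = lora_param_count(d_in, d_out, rank)   # adapter sent when only LoRA is federated

print(f"dense parameters:   {n_full}")
print(f"adapter parameters: {n_lora}")
print(f"uplink reduction:   {uplink_reduction(n_lora, n_full):.4%}")
```

For a square matrix the adapter-to-dense ratio is $2r/d$, so the saving grows with layer width and shrinks as the rank increases.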

Paper Structure

This paper contains 44 sections, 6 equations, 9 figures, 1 algorithm.

Figures (9)

  • Figure 1: Overall architecture of FedGen-Edge. The server holds a frozen global backbone $M_G(\theta_G)$ and a global adapter $A_G^t$. In each round, the server broadcasts $A_G^t$; selected clients locally train only their LoRA adapters on private data and upload the updated adapters for weighted aggregation to obtain $A_G^{t+1}$. Personalized inference uses $M_G \oplus A_k$.
  • Figure 2: Total upload cost on PTB (500 rounds, simulated). FedGen-Edge communicates only LoRA adapters, yielding $>99\%$ reduction vs. full-model FedAvg.
  • Figure 3: Convergence on PTB: perplexity (lower is better) vs. rounds. FedGen-Edge converges faster and reaches a lower final PPL with non-IID clients.
  • Figure 4: Convergence on CIFAR-10: FID (lower is better) vs. rounds. FedGen-Edge attains substantially lower FID under skewed label partitions.
  • Figure 5: Distribution of personalization gains on PTB: reduction of PPL after one local epoch of adapter personalization (100 clients). Most clients benefit notably, indicating consistent personalization.
  • ...and 4 more figures
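
The server step described in the Figure 1 caption (weighted aggregation of uploaded adapters into $A_G^{t+1}$) can be sketched as a FedAvg-style average over adapter tensors only. This is a minimal illustration, not the paper's algorithm: weighting by local sample count is an assumption borrowed from standard FedAvg, and the `Adapter` type, the `aggregate_adapters` function, and the tensor names are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: tensor name -> flattened LoRA weights.
Adapter = Dict[str, List[float]]


def aggregate_adapters(client_adapters: Sequence[Adapter],
                       num_samples: Sequence[int]) -> Adapter:
    """Sample-weighted average of client adapters (assumed FedAvg-style weights)."""
    if not client_adapters or len(client_adapters) != len(num_samples):
        raise ValueError("need one sample count per client adapter")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_adapter: Adapter = {}
    for name in client_adapters[0]:
        merged = [0.0] * len(client_adapters[0][name])
        for adapter, n in zip(client_adapters, num_samples):
            weight = n / total  # client k contributes in proportion to n_k
            for i, value in enumerate(adapter[name]):
                merged[i] += weight * value
        global_adapter[name] = merged
    return global_adapter


# Two toy clients holding 1 and 3 local samples respectively.
clients = [{"lora_A": [1.0, 2.0]}, {"lora_A": [3.0, 6.0]}]
new_global = aggregate_adapters(clients, num_samples=[1, 3])
print(new_global)
```

The frozen backbone never appears in this step, which is why the server can remain as simple as in FedAvg. One caveat: averaging the two LoRA factors separately is not the same as averaging their product, and this excerpt does not say which the paper does.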