FedProphet: Memory-Efficient Federated Adversarial Training via Robust and Consistent Cascade Learning

Minxue Tang; Yitu Wang; Jingyang Zhang; Louis DiValentin; Aolin Ding; Amin Hass; Yiran Chen; Hai "Helen" Li

TL;DR

FedProphet tackles the challenge of memory-efficient Federated Adversarial Training by partitioning a large backbone into cascaded modules that can be trained on memory-constrained devices without model swapping. On the server, a memory-aware model partitioner and a training coordinator with Adaptive Perturbation Adjustment and Differentiated Module Assignment coordinate module allocation and perturbation sizing, while a Partial-Average Model Aggregator compiles heterogeneous updates. Theoretical results show that strong convexity regularization bounds feature perturbations and reduces gradient inconsistency, enabling robustness to transfer from modules to the full backbone. Empirically, FedProphet achieves up to 80% memory reduction and up to 10.8x training-time speedup while maintaining competitive accuracy and adversarial robustness across balanced and unbalanced device settings, outperforming existing memory-efficient FAT baselines. This framework offers practical scalability for robust, privacy-preserving learning on edge devices and suggests promising extensions to NLP and other memory-saving approaches.
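The two server-side steps named above, memory-aware partitioning and partial averaging, can be pictured with a minimal sketch. It is not the paper's algorithm: the greedy grouping rule, the per-layer memory costs, and the function names `partition_layers` and `partial_average` are all assumptions made for illustration. The only ideas taken from the summary are that each cascaded module must fit a device's memory and that updates arrive for different modules from different clients.

```python
from typing import Dict, List, Sequence


def partition_layers(layer_mem: Sequence[float], budget: float) -> List[List[int]]:
    """Greedily group consecutive layers into modules that each fit in `budget`.

    Hypothetical rule: start a new module as soon as adding the next layer
    would exceed the memory budget.
    """
    modules: List[List[int]] = []
    current: List[int] = []
    used = 0.0
    for idx, mem in enumerate(layer_mem):
        if mem > budget:
            raise ValueError(f"layer {idx} alone exceeds the memory budget")
        if current and used + mem > budget:
            modules.append(current)
            current, used = [], 0.0
        current.append(idx)
        used += mem
    if current:
        modules.append(current)
    return modules


def partial_average(updates: Sequence[Dict[int, List[float]]]) -> Dict[int, List[float]]:
    """Average each module's weights over only the clients that trained it.

    Each update maps a module id to a flat weight list (assumed format).
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for client_update in updates:
        for module_id, weights in client_update.items():
            if module_id not in sums:
                sums[module_id] = [0.0] * len(weights)
                counts[module_id] = 0
            sums[module_id] = [s + w for s, w in zip(sums[module_id], weights)]
            counts[module_id] += 1
    return {m: [s / counts[m] for s in sums[m]] for m in sums}


if __name__ == "__main__":
    # Five layers with made-up memory costs and a budget of 6 units.
    print(partition_layers([4, 3, 2, 5, 1], budget=6))
    # Client 0 trained module 0 only; client 1 trained modules 0 and 1.
    print(partial_average([{0: [1.0, 3.0]}, {0: [3.0, 5.0], 1: [2.0]}]))
```

In this sketch a module that only one client trained is taken from that client unchanged, while modules trained by several clients are averaged over those clients alone, so clients that skipped a module do not dilute its update.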

Abstract

Federated Adversarial Training (FAT) can add robustness against adversarial examples to Federated Learning (FL), a meaningful step toward trustworthy AI. However, FAT requires large models to preserve high accuracy while achieving strong robustness, which incurs high memory-swapping latency when training on memory-constrained edge devices. Existing memory-efficient FL methods suffer from poor accuracy and weak robustness because their local and global models are inconsistent. In this paper, we propose FedProphet, a novel FAT framework that achieves memory efficiency, robustness, and consistency simultaneously. FedProphet reduces the memory requirement of local training while guaranteeing adversarial robustness through adversarial cascade learning with strong convexity regularization, and we show that strong robustness also implies low inconsistency in FedProphet. We also develop a training coordinator on the FL server, with Adaptive Perturbation Adjustment for utility-robustness balance and Differentiated Module Assignment for objective inconsistency mitigation. FedProphet significantly outperforms other baselines under different experimental settings, maintaining the accuracy and robustness of end-to-end FAT with 80% memory reduction and up to 10.8x speedup in training time.
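To make the role of strong convexity concrete, the following block states a standard fact from convex analysis. It is a sketch under stated assumptions, not the paper's exact objective: the quadratic form of the regularizer, the coefficient $\mu>0$, the feature vector $z$, and the name $\tilde{l}$ are all introduced here for illustration.

```latex
% Assumed illustrative form: l is convex in the features z, and \mu > 0.
% Adding a quadratic term makes the regularized loss \mu-strongly convex:
\tilde{l}(z) = l(z) + \frac{\mu}{2}\lVert z\rVert_2^2
\quad\Longrightarrow\quad
\tilde{l}(z') \ge \tilde{l}(z) + \langle g,\, z'-z\rangle
  + \frac{\mu}{2}\lVert z'-z\rVert_2^2
\quad \text{for all } g \in \partial\tilde{l}(z).

% At a minimizer z^\star we may take g = 0, so a bounded loss increase
% implies a bounded displacement in feature space:
\lVert z'-z^\star\rVert_2 \le
  \sqrt{\frac{2\left(\tilde{l}(z')-\tilde{l}(z^\star)\right)}{\mu}}.
```

Under these assumptions, a quadratic lower bound of this kind is what allows a bound on the loss increase to be converted into a bound on how far the features can move.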

Paper Structure (44 sections, 5 theorems, 21 equations, 10 figures, 8 tables, 2 algorithms)

Introduction
Related Works and Preliminaries
Federated Learning
Adversarial Training
Cascade Learning
Motivations
Overview
Local Client Design
Adversarial Cascade Learning with Strong Convexity Regularization
Sufficient Condition for Backbone Robustness.
Strong Convexity Regularization.
Robustness-Consistency Relationship in Adversarial Cascade Learning
Central Server Design
Memory-constrained Model Partition
Adaptive Perturbation Adjustment
...and 29 more sections

Key Result

Proposition 1

The backbone model $({\bm{w}}_1\circ\cdots\circ {\bm{w}}_M)$ has $(\epsilon_0,c_M)$-robustness in the joint loss $l$, if for every module $m<M$, we have [...] and the last module has $(\epsilon_{M-1},c_M)$-robustness in $l_M=l$. Here ${\bm{a}}\circ{\bm{b}}$ denotes a cascade of two modules ${\bm{a}}$ and ${\bm{b}}$, with the input entering ${\bm{a}}$ and the output leaving ${\bm{b}}$.

Figures (10)

Figure 1: An illustration of Cascade Learning.
Figure 2: The local training overhead breakdown and latency in two workloads, (a) VGG16 on CIFAR-10 and (b) ResNet34 on Caltech-256. "Suff. Mem" denotes training with sufficient memory resources, "Lim. w/ Swap" denotes training with 20% memory and memory swapping, and "Lim. w/o Swap" denotes training with 20% memory and FedRolex (Alam et al., 2022).
Figure 3: A framework of FedProphet. We formalize the framework in \ref{alg:fedprophet}.
Figure 4: An illustration of Adversarial Cascade Learning.
Figure 5: An illustration of resource-constrained module assignment with memory and FLOPs constraints.
...and 5 more figures

Theorems & Definitions (9)

Definition 1
Proposition 1
Lemma 1
Remark 1
Lemma 2
Lemma 1
proof
Lemma 2
proof