
FedFG: Privacy-Preserving and Robust Federated Learning via Flow-Matching Generation

Ruiyang Wang, Rong Pan, Zhengan Yao

Abstract

Federated learning (FL) enables distributed clients to collaboratively train a global model using local private data. Nevertheless, recent studies show that conventional FL algorithms still exhibit deficiencies in privacy protection, and the server lacks a reliable and stable aggregation rule for updating the global model. This situation creates opportunities for adversaries: on the one hand, they may eavesdrop on uploaded gradients or model parameters, potentially leaking benign clients' private data; on the other hand, they may compromise clients to launch poisoning attacks that corrupt the global model. To balance accuracy and security, we propose FedFG, a robust FL framework based on flow-matching generation that simultaneously preserves client privacy and resists sophisticated poisoning attacks. On the client side, each local network is decoupled into a private feature extractor and a public classifier. Each client is further equipped with a flow-matching generator that replaces the extractor when interacting with the server, thereby protecting private features while learning an approximation of the underlying data distribution. Complementing the client-side design, the server employs a client-update verification scheme and a novel robust aggregation mechanism driven by synthetic samples produced by the flow-matching generator. Experiments on MNIST, FMNIST, and CIFAR-10 demonstrate that, compared with prior work, our approach adapts to multiple attack strategies and achieves higher accuracy while maintaining strong privacy protection.
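The abstract's client-side design trains a flow-matching generator to approximate the private feature distribution. As a minimal sketch of what such training could look like, the snippet below implements the standard conditional flow-matching objective (regressing a velocity field onto the straight-line target between a base sample and a data sample); the paper's actual generator architecture and loss details are not given in this summary, and `velocity_net` is a hypothetical placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

def velocity_net(x_t, t, W):
    # Hypothetical stand-in for the generator's velocity network v_theta(x_t, t).
    return x_t @ W + t  # t broadcasts over the batch

def flow_matching_loss(x0, x1, W):
    """Conditional flow matching: sample a time t, move along the straight
    path from x0 to x1, and regress the predicted velocity onto (x1 - x0)."""
    t = rng.uniform(size=(x0.shape[0], 1))
    x_t = (1.0 - t) * x0 + t * x1      # point on the linear interpolation path
    target = x1 - x0                   # constant target velocity along the path
    pred = velocity_net(x_t, t, W)
    return np.mean((pred - target) ** 2)

# Toy usage: base noise vs. "private feature" samples, as a client might see.
x0 = rng.normal(size=(8, 4))   # samples from the base (noise) distribution
x1 = rng.normal(size=(8, 4))   # samples from the private feature distribution
W = np.zeros((4, 4))
loss = flow_matching_loss(x0, x1, W)
```

Minimizing this loss drives the network toward a velocity field whose flow transports base samples to feature-like samples, which is what lets the generator stand in for the private extractor during communication.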


Paper Structure

This paper contains 33 sections, 3 theorems, 41 equations, 6 figures, 4 tables, 2 algorithms.

Key Result

Lemma 4.1

Under Assumptions ass:smooth and ass:local, if $\eta_r Q \le \min\{1/(2L),\,1\}$, then on the event $\mathcal{E}^r$ a one-step descent inequality holds with constants $c_1 = 3/4$ and $c_2 = LV/2 + B^2$.

Figures (6)

  • Figure 1: Overview of FedFG. On each client, the local model is decoupled into a private feature extractor and a public classifier, and is equipped with a flow-matching generator that replaces the extractor during communication with the server. The server collects model parameters from clients and performs robust aggregation with three modules: 1) evaluation of outlier and accuracy scores via synthetic samples generated by the flow-matching generator; 2) detection of malicious clients based on the Hampel rule and accuracy threshold; and 3) accuracy-aware reweighting aggregation over the remaining benign clients.
  • Figure 2: Comparison of ground-truth images and recovered images under DLG and IG attacks. The PSNR (in dB) and SSIM values are listed sequentially below each image. The first row reports results on MNIST, the second row reports results on FMNIST, and the third and fourth rows report results on CIFAR-10.
  • Figure 3: Statistics of the outlier scores for malicious and benign clients, together with the dynamic thresholds, as they vary with training epochs under attacks on MNIST.
  • Figure 4: Test-accuracy trajectories of FL defense methods under different distribution settings, with the attack starting at round 20.
  • Figure 5: Test accuracy of FL methods on FMNIST and FMNIST-0.5 under SF and IPM as the proportion of malicious clients increases.
  • ...and 1 more figure
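Figure 1 describes the server pipeline as outlier/accuracy scoring, Hampel-rule detection, and accuracy-aware reweighting over the surviving clients. The sketch below shows one plausible reading of that pipeline, assuming "Hampel rule" denotes the usual median/MAD outlier test; the paper's exact score definitions, thresholds, and weighting scheme are not given in this summary, so all names and constants here are illustrative.

```python
import numpy as np

def hampel_filter(scores, k=3.0):
    """Flag clients whose outlier score deviates from the median by more
    than k scaled median absolute deviations (MAD)."""
    scores = np.asarray(scores, dtype=float)
    med = np.median(scores)
    mad = 1.4826 * np.median(np.abs(scores - med))  # scale to match std
    if mad == 0:
        return np.zeros(scores.shape, dtype=bool)
    return np.abs(scores - med) / mad > k

def reweighted_aggregate(updates, accuracies, outlier_scores, acc_threshold=0.5):
    """Drop clients flagged by the Hampel rule or below the accuracy
    threshold, then average the rest with accuracy-proportional weights."""
    flagged = hampel_filter(outlier_scores)
    keep = (~flagged) & (np.asarray(accuracies) >= acc_threshold)
    w = np.asarray(accuracies, dtype=float)[keep]
    w = w / w.sum()
    return np.average(np.asarray(updates)[keep], axis=0, weights=w)

# Toy usage: the third client submits a poisoned update with a large
# outlier score and low accuracy on the synthetic samples.
updates = np.array([[1.0, 1.0], [1.1, 0.9], [10.0, -10.0]])
acc = [0.9, 0.85, 0.1]
out_scores = [0.2, 0.25, 5.0]
agg = reweighted_aggregate(updates, acc, out_scores)
```

In this toy run the malicious client is excluded by both tests, so the aggregate stays close to the benign updates rather than being dragged toward the poisoned one.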

Theorems & Definitions (7)

  • Lemma 4.1: One-step descent
  • Proof
  • Lemma 4.2: Weight perturbation
  • Proof
  • Theorem 4.1: Convergence of FedFG
  • Proof
  • Remark 4.1: Convergence rate interpretation