Lightweight and Robust Federated Data Valuation

Guojun Tang, Jiayu Zhou, Mohammad Mamun, Steve Drew

TL;DR

FedIF addresses robustness in federated learning under non-IID data and adversarial participants by replacing computationally heavy Shapley-value calculations with trajectory-based influence estimation. It computes lightweight per-client influence scores from local updates and a public validation gradient, then applies normalized and smoothed influence weights to adaptively aggregate client models. The authors provide a theoretical bound showing FedIF can tighten the one-step loss-change bound in noisy environments, and empirical results on CIFAR-10 and Fashion-MNIST demonstrate comparable or superior robustness to SV-based methods while reducing aggregation overhead by up to 450x. Ablation studies confirm the importance of local weight normalization, round normalization, and smoothing, highlighting FedIF as a scalable, practical alternative for robust FL in real-world deployments, though adversarial attacks like PGD remain challenging in extreme settings.

Abstract

Federated learning (FL) faces persistent robustness challenges due to non-IID data distributions and adversarial client behavior. A promising mitigation strategy is contribution evaluation, which enables adaptive aggregation by quantifying each client's utility to the global model. However, state-of-the-art Shapley-value-based approaches incur high computational overhead due to repeated model reweighting and inference, which limits their scalability. We propose FedIF, a novel FL aggregation framework that leverages trajectory-based influence estimation to efficiently compute client contributions. FedIF adapts decentralized FL by introducing normalized and smoothed influence scores computed from lightweight gradient operations on client updates and a public validation set. Theoretical analysis demonstrates that FedIF yields a tighter bound on one-step global loss change under noisy conditions. Extensive experiments on CIFAR-10 and Fashion-MNIST show that FedIF achieves robustness comparable to or exceeding SV-based methods in the presence of label noise, gradient noise, and adversarial samples, while reducing aggregation overhead by up to 450x. Ablation studies confirm the effectiveness of FedIF's design choices, including local weight normalization and influence smoothing. Our results establish FedIF as a practical, theoretically grounded, and scalable alternative to Shapley-value-based approaches for efficient and robust FL in real-world deployments.
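The pipeline the abstract describes (score each client from its update and a public validation gradient, normalize, smooth, then aggregate) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the first-order score `-<g_val, delta_i>`, the clip-and-rescale normalization, the exponential-moving-average smoothing, and all function names and the `momentum` parameter are assumptions chosen for clarity. The paper's exact formulas (including its local weight normalization and round normalization) may differ.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def influence_scores(client_updates, val_grad):
    """First-order influence estimate per client (assumed form).

    Moving the global model by update d changes the validation loss by roughly
    <val_grad, d>, so the predicted loss decrease is -<val_grad, d>.
    """
    return [-dot(val_grad, d) for d in client_updates]


def normalize(scores):
    """Clip negative (harmful) scores to zero and rescale so weights sum to 1.

    Falls back to uniform weights when no client has a positive score.
    """
    clipped = [max(s, 0.0) for s in scores]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(scores)] * len(scores)
    return [c / total for c in clipped]


def smooth(prev_weights, new_weights, momentum=0.5):
    """Blend this round's weights with the previous round's (assumed EMA)."""
    mixed = [momentum * p + (1.0 - momentum) * w
             for p, w in zip(prev_weights, new_weights)]
    total = sum(mixed)
    return [m / total for m in mixed]


def aggregate(global_model, client_updates, weights):
    """Apply the weighted sum of client updates to the global parameters."""
    return [w0 + sum(a * d[k] for a, d in zip(weights, client_updates))
            for k, w0 in enumerate(global_model)]
```

Each score costs one inner product per client, whereas a Shapley-value estimate requires repeatedly re-aggregating and re-evaluating models over client subsets. That difference is the intuition behind the overhead reduction the abstract reports, although this sketch makes no claim about the exact factor.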

Paper Structure

This paper contains 17 sections, 2 theorems, 22 equations, 1 figure, 8 tables, 1 algorithm.

Key Result

Lemma 1

Under the local dissimilarity assumption, the gradients of the local objectives are bounded in terms of the gradient of the global objective: $\sum_i p^i \left\lVert \nabla F^i(w) \right\rVert^2 \le \beta^2 + \left\lVert \nabla F(w) \right\rVert^2$.
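The statement of the dissimilarity assumption is not reproduced here, so the following derivation is only a sketch under an assumed form. It supposes the assumption reads $\sum_i p^i \lVert \nabla F^i(w) - \nabla F(w) \rVert^2 \le \beta^2$, with $F(w) = \sum_i p^i F^i(w)$ and $\sum_i p^i = 1$, which is a common way to state bounded dissimilarity. Under that reading, the lemma follows from expanding the square:

```latex
\begin{align*}
\sum_i p^i \left\lVert \nabla F^i(w) - \nabla F(w) \right\rVert^2
  &= \sum_i p^i \left\lVert \nabla F^i(w) \right\rVert^2
     - 2 \Big\langle \sum_i p^i \nabla F^i(w),\, \nabla F(w) \Big\rangle
     + \left\lVert \nabla F(w) \right\rVert^2 \\
  &= \sum_i p^i \left\lVert \nabla F^i(w) \right\rVert^2
     - \left\lVert \nabla F(w) \right\rVert^2 , \\
\text{hence}\quad
\sum_i p^i \left\lVert \nabla F^i(w) \right\rVert^2
  &= \sum_i p^i \left\lVert \nabla F^i(w) - \nabla F(w) \right\rVert^2
     + \left\lVert \nabla F(w) \right\rVert^2
  \;\le\; \beta^2 + \left\lVert \nabla F(w) \right\rVert^2 .
\end{align*}
```

The second line uses $\sum_i p^i \nabla F^i(w) = \nabla F(w)$. If the paper states the assumption differently, the intermediate steps would change accordingly.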

Figures (1)

  • Figure 1: Overview of FedIF, a novel FL aggregation framework that leverages trajectory-based influence estimation to efficiently compute client contributions.

Theorems & Definitions (4)

  • Lemma 1
  • Theorem 1: Bounded One-step Loss Change
  • Remark 1
  • Remark 2