FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?

Marco Bornstein; Amrit Singh Bedi; Abdirisak Mohamed; Furong Huang

TL;DR

This paper tackles free riding and cost misreporting in federated learning by introducing Federated Agent Cost Truthfulness (FACT). FACT combines a Penalized Federated Learning (PFL) loss with a sandwich-style truthfulness competition to enforce locally optimal data usage and truthful cost reporting, without relying on auctions. The authors prove that truthful agents participating in FACT achieve a lower loss than they would by training locally, that free riding is eliminated, and that the mechanism remains individually rational. Empirical results on CIFAR-10, MNIST, and HAM10000 show substantial reductions in agent loss (up to ~4x) and support the method's practicality in realistic, non-iid, multi-institution settings.

Abstract

Standard federated learning (FL) approaches are vulnerable to the free-rider dilemma: participating agents can contribute little to nothing yet receive a well-trained aggregated model. While prior mechanisms attempt to solve the free-rider dilemma, none have addressed the issue of truthfulness. In practice, adversarial agents can provide false information to the server in order to cheat their way out of contributing to federated training. In an effort to make free-riding-averse federated mechanisms truthful, and consequently less prone to breaking down in practice, we propose FACT. FACT is the first federated mechanism that: (1) eliminates federated free riding by using a penalty system, (2) ensures agents provide truthful information by creating a competitive environment, and (3) encourages agent participation by offering better performance than training alone. Empirically, FACT avoids free riding when agents are untruthful, and reduces agent loss by over 4x.

Paper Structure (12 sections, 8 theorems, 29 equations, 7 figures, 2 tables, 3 algorithms)

Introduction
Related Works
The Challenge of Free Riding in Federated Learning
Eliminating Free Riding via Penalization
FACT: Eliminating Free-Riding With Untruthful Agents
Experimental Results
Real-World Case Study: Inter-Hospital Skin Cancer Diagnosis
Conclusion
Additional Related Works
Additional Experimental Results
Proofs
Impact Statement & Limitations

Key Result

Theorem 1 (Optimal Local Data Usage)

For an agent $i$ with marginal cost $c_i$, the optimal amount of data $m_{i,l}^*$ used for local training is $m_{i,l}^* := \sqrt{\frac{\gamma \sigma^2 L}{2c_i}}$.
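
To make the scaling in Theorem 1 concrete, the following is a minimal sketch that simply evaluates the stated formula. This excerpt does not define $\gamma$, $\sigma^2$, or $L$, so the function treats them as given positive constants; the function name, argument names, and numeric values are illustrative assumptions and are not taken from the paper.

```python
import math


def optimal_local_data(gamma: float, sigma_sq: float, L: float, c_i: float) -> float:
    """Evaluate m_{i,l}^* = sqrt(gamma * sigma^2 * L / (2 * c_i)) from Theorem 1.

    gamma, sigma_sq, and L are the constants appearing in the theorem
    (assumed positive here; their definitions are not given in this excerpt).
    c_i is the agent's marginal cost per data point.
    """
    if min(gamma, sigma_sq, L) <= 0:
        raise ValueError("gamma, sigma_sq, and L are assumed to be positive")
    if c_i <= 0:
        raise ValueError("marginal cost c_i must be positive")
    return math.sqrt(gamma * sigma_sq * L / (2.0 * c_i))


# Hypothetical constants chosen only to keep the arithmetic easy to follow.
m_cheap = optimal_local_data(gamma=1.0, sigma_sq=2.0, L=4.0, c_i=1.0)
m_costly = optimal_local_data(gamma=1.0, sigma_sq=2.0, L=4.0, c_i=4.0)
```

Because $c_i$ appears under the square root in the denominator, quadrupling an agent's marginal cost halves its locally optimal amount of data, which is what the two calls above illustrate.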

Figures (7)

Figure 1: Enforcement of Agent Truthfulness. Average net improvement in loss over local training is plotted for iid (dotted line) and two non-iid agent distributions (D-0.6: dashed, D-0.3: solid). For both CIFAR10 (left) and MNIST (right), agents maximize their net improvement in loss when they are truthful (0% added) about their true cost. This matches our theory in Theorem \ref{thm:fact-optimal}.
Figure 2: Reduction in Agent Loss. The average agent loss for baselines on CIFAR10 (top row) and MNIST (bottom row) under iid (left), and two non-iid data distributions (center: D-0.6, right: D-0.3). Traditional FL is an upper bound on agent loss (if agents did not free ride). FACT improves agent loss over local training by up to a factor of 3x for CIFAR10 and 4x for MNIST.
Figure 3: Elimination of Free Riding via Penalty. The penalty term $P_{fr}(m_i)$ plus data collection costs $c_i m_i$ is plotted for CIFAR10 (left) and MNIST (right) for varying data contributions $m_i$. These combined costs are minimized at the local optimum $m_{i,l}^*$, as predicted by Theorem \ref{thm:pen-fed-optimal}.
Figure 4: FACT Eliminates Free Riding in Realistic Settings. When training an image classifier for diagnosing skin cancer, agents participating in FACT achieve much lower loss ($66\%$ less) than if they did not participate (left). Agents maximize their improvement in loss over local training when they are truthful; reporting inflated or deflated costs diminishes improvement in loss (middle). Agents minimize penalties when using their locally optimal amount of data ($m^* = 801$) for training (right).
Figure 5: Test Loss for CIFAR10 (top) and MNIST (bottom) in Heterogeneous Settings. FL outperforms local training on iid (left) and mild (middle) & strong (right) non-iid Dirichlet settings.
...and 2 more figures

Theorems & Definitions (26)

Theorem 1: Optimal Local Data Usage
Remark 1
Theorem 2: Free-Riding: Optimal Federated Data Usage
Remark 2
Theorem 3: PFL Eliminates Free Riding
Remark 3
Lemma 1: PFL Assurance of IR at Optimum
Remark 4
Remark 5
Theorem 4: Elimination of Federated Free-Riding With Truthful Agents
...and 16 more
