FLMarket: Enabling Privacy-preserved Pre-training Data Pricing for Federated Learning

Zhenyu Wen, Wanglei Feng, Di Wu, Haozhen Hu, Chang Xu, Bin Qian, Zhen Hong, Cong Wang, Shouling Ji

TL;DR

This work proposes FLMarket, which integrates a two-stage pricing mechanism with a security protocol to address the utility-privacy conflict. Client selection according to FLMarket achieves more than 10% higher accuracy in subsequent FL training than state-of-the-art methods.

Abstract

Federated Learning (FL), a mainstream privacy-preserving machine learning paradigm, offers promising solutions for privacy-critical domains such as healthcare and finance. Although both academia and industry have devoted extensive effort to improving vanilla FL, little work focuses on the data pricing mechanism. In contrast to the more straightforward in-training and post-training pricing techniques, we study the harder problem of pre-training pricing, where no direct information from the learning process is available. We propose FLMarket, which integrates a two-stage, auction-based pricing mechanism with a security protocol to address the utility-privacy conflict. Through comprehensive experiments, we show that client selection according to FLMarket achieves more than 10% higher accuracy in subsequent FL training than state-of-the-art methods. In addition, it outperforms the in-training baseline, with an accuracy increase of more than 2% and a 3x run-time speedup.

Paper Structure

This paper contains 31 sections, 11 theorems, 11 equations, 13 figures, 4 tables, 1 algorithm.

Key Result

Theorem 3.1

A bidding mechanism is truthful if and only if [singer2010budget]:

Figures (13)

  • Figure 1: FLMarket high-level system overview
  • Figure 2: The impact of data quantity and distribution for FL training: a) An incremental increase of 1,000 data points has shown a diminishing rate of improvement in global accuracy; b) Imbalanced categories in both local and global data result in the worst performance.
  • Figure 3: CIFAR-10: $n$ clients selected from 20 clients
  • Figure 4: CINIC-10: $n$ clients selected from 20 clients
  • Figure 5: DEAP: $n$ clients selected from 20 clients
  • ...and 8 more figures

Theorems & Definitions (11)

  • Theorem 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Lemma 3.6
  • Theorem 3.7
  • Lemma D.1 [ben2010theory]
  • Theorem G.1
  • Theorem G.2
  • ...and 1 more