Federated Learning with a Single Shared Image

Sunny Soni, Aaqib Saeed, Yuki M. Asano

TL;DR

This paper tackles the knowledge transfer bottleneck in federated learning when public data and bandwidth are scarce. It introduces a single-shared-image distillation approach that generates a rich set of informative patches (patchification) and uses dynamic pruning (KMeans balancing and entropy pruning) to drive server-side knowledge distillation across heterogeneous client architectures. The method demonstrates superior performance under limited shared dataset budgets and low-bandwidth conditions, and extends to scenarios with heterogeneous client models via client-model mirroring on the server. The work offers a practical, privacy-preserving pathway for FL deployment in data- and resource-constrained environments, with potential applicability to other modalities and KD-based strategies.

Abstract

Federated Learning (FL) enables multiple machines to collaboratively train a machine learning model without sharing private training data. Yet, especially for heterogeneous models, a key bottleneck remains the transfer of knowledge gained by each client model to the server. One popular method, FedDF, tackles this task with distillation, using a common shared dataset on which predictions are exchanged. However, in many contexts such a dataset may be difficult to acquire due to privacy concerns, and clients might not allow storage of a large shared dataset. To this end, we introduce a new method that improves this knowledge distillation approach so that it relies only on a single image shared between clients and server. In particular, we propose a novel adaptive dataset pruning algorithm that selects the most informative crops generated from only a single image. With this, we show that federated learning with distillation under a limited shared dataset budget works better with a single image than with multiple individual ones. Finally, we extend our approach to allow for training heterogeneous client architectures by incorporating a non-uniform distillation schedule and client-model mirroring on the server side.
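To make the pruning idea concrete, here is a minimal sketch of entropy-based patch selection in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `n_keep` budget, and the `keep_low_entropy` switch are hypothetical, and the abstract does not say in which direction the paper ranks patches, so the direction is left as a parameter.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one patch's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def prune_by_entropy(patch_logits, n_keep, keep_low_entropy=True):
    """Return the sorted indices of the patches that survive pruning.

    patch_logits: one list of class logits per patch, e.g. the global
    model's predictions on every crop of the shared image.
    keep_low_entropy: ASSUMPTION, the ranking direction is a free choice
    in this sketch and is not taken from the paper.
    """
    scores = [entropy(softmax(logits)) for logits in patch_logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=not keep_low_entropy)
    return sorted(order[:n_keep])

# Three patches: confident, maximally uncertain, and in between.
example_logits = [[10.0, 0.0, 0.0], [0.0, 0.0, 0.0], [2.0, 1.0, 0.0]]
kept = prune_by_entropy(example_logits, n_keep=2)
```

In a full pipeline the surviving indices would define the distillation set for the next round; the KMeans balancing step mentioned in the TL;DR is not shown here.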

Paper Structure (26 sections, 4 equations, 5 figures, 11 tables, 2 algorithms)


Figures (5)

  • Figure 1: Illustration of our federated learning algorithm using a single image. Our algorithm generates a common distillation dataset from a single shared image using deterministic augmentations. It then dynamically selects the best patches for training the global model in the next round using knowledge distillation.
  • Figure 2: Scatter plot of t-SNE embeddings of pruned single image patches during different phases of training, using our method with FedAvg and ResNet-8 on CIFAR10.
  • Figure 3: Comparison of test performance in the federated setting when using a single image with patch selection versus an equivalently sized set of independent training samples from a labeled dataset as the shared distillation dataset. We use different rates of FedAvg initializations to emulate different network bandwidth conditions. Detailed results are in Table \ref{tab:exp_comm_comparison}.
  • Figure 4: Single Image sources used for our experiments for distillation dataset generation.
  • Figure 5: t-SNE manifold visualization of the distillation dataset, corresponding to the t-SNE scatter plot in Fig. \ref{fig:img_viz:tsne_high_accuracy}.
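
The caption of Figure 1 mentions deterministic augmentations, which is what lets clients and server rebuild the same distillation dataset from one image without transmitting the patches themselves. The sketch below illustrates that idea with seeded random crop boxes. The function name, the scale range, and the choice of random crops as the augmentation are assumptions made for illustration; the paper's actual augmentation pipeline is not described in this summary.

```python
import random

def crop_boxes(image_size, num_patches, seed, min_scale=0.1, max_scale=0.5):
    """Generate reproducible crop boxes as (left, upper, right, lower).

    image_size: (width, height) of the shared image in pixels.
    seed: value agreed on by server and clients (hypothetical protocol detail).
    min_scale / max_scale: ASSUMED range for the crop side relative to the image.
    """
    width, height = image_size
    rng = random.Random(seed)  # private generator, independent of global state
    boxes = []
    for _ in range(num_patches):
        scale = rng.uniform(min_scale, max_scale)
        w = max(1, int(width * scale))
        h = max(1, int(height * scale))
        left = rng.randint(0, width - w)
        top = rng.randint(0, height - h)
        boxes.append((left, top, left + w, top + h))
    return boxes

# Server and client call the same function with the same arguments
# and obtain the same list of patches to cut from the shared image.
server_boxes = crop_boxes((2048, 1536), num_patches=100, seed=0)
client_boxes = crop_boxes((2048, 1536), num_patches=100, seed=0)
```

With this design only the image, the seed, and the indices of the selected patches would need to be exchanged. The box layout matches the 4-tuple convention used by common image libraries such as Pillow's `Image.crop`.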