FedOAED: Federated On-Device Autoencoder Denoiser for Heterogeneous Data under Limited Client Availability

S M Ruhul Kabir Howlader, Xiao Chen, Yifei Xie, Lu Liu

TL;DR

The paper addresses privacy-preserving learning under heterogeneous data and limited client participation in federated settings. It introduces FedOAED, which adds an on-device autoencoder denoiser trained on recent update snapshots to denoise and convexly mix updates before upload. This approach reduces client-drift and variance from partial participation, improving convergence on Non-IID vision tasks. Experiments on F-MNIST and CIFAR-10 show FedOAED outperforms state-of-the-art baselines without extra communication.

Abstract

Over the last few decades, machine learning (ML) and deep learning (DL) solutions have demonstrated their potential across many applications by leveraging large amounts of high-quality data. However, strict data-sharing regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA) have prevented many data-driven applications from being realised. Federated Learning (FL), in which raw data never leaves local devices, has shown promise in overcoming these limitations. Although FL has grown rapidly in recent years, it still struggles with heterogeneity, which produces gradient noise, client-drift, and increased variance from partial client participation. In this paper, we propose FedOAED, a novel federated learning algorithm designed to mitigate client-drift arising from multiple local training updates and the variance induced by partial client participation. FedOAED incorporates an on-device autoencoder denoiser on the client side to mitigate client-drift and variance resulting from heterogeneous data under limited client availability. Experiments on multiple vision datasets under Non-IID settings demonstrate that FedOAED consistently outperforms state-of-the-art baselines.
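To make the client-side step concrete, here is a minimal sketch of denoising an update with an autoencoder fitted on recent update snapshots and then convexly mixing the result with the raw update before upload. It rests on several assumptions that this page does not confirm: a linear autoencoder (whose optimum is the PCA projection, computed here by SVD) stands in for the paper's denoiser, updates are flattened vectors, and the mix is taken between the raw and the denoised update. The names `fit_linear_denoiser`, `denoise`, `mix_update`, `latent_dim`, and `lam` are hypothetical.

```python
import numpy as np


def fit_linear_denoiser(snapshots, latent_dim):
    """Fit a linear autoencoder (PCA projection) on recent update snapshots.

    Assumption: a linear model replaces the paper's autoencoder.
    snapshots: array of shape (num_snapshots, dim), one flattened update per row.
    Returns (mean, basis), where basis has shape (latent_dim, dim).
    """
    mean = snapshots.mean(axis=0)
    centered = snapshots - mean
    # Rows of vt are orthonormal principal directions of the snapshot buffer.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:latent_dim]


def denoise(update, mean, basis):
    """Encode the update into the latent space, then decode it back."""
    code = basis @ (update - mean)   # encoder
    return mean + basis.T @ code     # decoder


def mix_update(update, denoised, lam):
    """Convex combination of the denoised and the raw update (assumed form)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex mix")
    return lam * denoised + (1.0 - lam) * update


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    snapshots = rng.normal(size=(8, 20))   # hypothetical buffer of 8 recent updates
    mean, basis = fit_linear_denoiser(snapshots, latent_dim=3)
    raw = rng.normal(size=20)              # hypothetical current local update
    to_upload = mix_update(raw, denoise(raw, mean, basis), lam=0.5)
    print(to_upload.shape)
```

Because the uploaded vector has the same shape as the raw update, a scheme of this kind adds no payload to the upload, which is consistent with the TL;DR's claim of no extra communication.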

Paper Structure

This paper contains 12 sections, 10 equations, 1 figure, 1 table, 4 algorithms.

Figures (1)

  • Figure 1: Accuracy (top row) and loss (bottom row) comparison. Datasets: FMNIST (left two columns) and CIFAR-10 (right two columns). Data partitioning methods: Dirichlet (first and third columns) and LQ-2 (second and fourth columns).
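
For readers unfamiliar with the Dirichlet partitioning named in the caption, the following sketch shows one common way to build such a Non-IID split: for each class, client shares are drawn from a Dirichlet distribution and the class's samples are divided accordingly. This is an assumption about the general technique, not the paper's exact procedure; the function name `dirichlet_partition` and the parameters `alpha` and `seed` are hypothetical, and the concentration value used in the paper is not stated on this page.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients using per-class Dirichlet shares.

    Smaller alpha gives more skewed (more heterogeneous) label distributions.
    Returns a list of index lists, one per client.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet(alpha) sample is a set of Gamma(alpha, 1) draws, normalised.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        start, cumulative = 0, 0.0
        for client_id, g in enumerate(gammas):
            cumulative += g / total
            if client_id == num_clients - 1:
                end = len(indices)  # last client takes whatever remains
            else:
                end = round(cumulative * len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 10 for i in range(1000)]  # hypothetical 10-class dataset
    parts = dirichlet_partition(toy_labels, num_clients=5, alpha=0.5)
    print([len(p) for p in parts])
```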