FLea: Addressing Data Scarcity and Label Skew in Federated Learning via Privacy-preserving Feature Augmentation

Tong Xia, Abhirup Ghosh, Xinchi Qiu, Cecilia Mascolo

TL;DR

FLea tackles the twin FL challenges of data scarcity and label skew by sharing privacy-preserving intermediate activations through a global feature buffer and augmenting local training with representation-space mix-ups. It adds a distillation term to curb local drift and a distance-correlation loss to reduce leakage, and updates the global model via FedAvg-style aggregation. Experiments across image, audio, and sensor datasets show that FLea outperforms state-of-the-art FL baselines, with accuracy gains of more than 5% in 13 of the 18 evaluated settings, while mitigating the privacy risks associated with feature sharing. The method offers a practical balance between improving global performance under data constraints and preserving client privacy in cross-device FL.

Abstract

Federated Learning (FL) enables model development by leveraging data distributed across numerous edge devices without transferring local data to a central server. However, existing FL methods still face challenges when dealing with scarce and label-skewed data across devices, which cause local model overfitting and drift and consequently hinder the performance of the global model. In response to these challenges, we propose a pioneering framework called FLea, incorporating the following key components: i) A global feature buffer that stores activation-target pairs shared from multiple clients to support local training. This design mitigates local model drift caused by the absence of certain classes; ii) A feature augmentation approach based on local and global activation mix-ups for local training. This strategy enlarges the set of training samples, thereby reducing the risk of local overfitting; iii) An obfuscation method to minimize the correlation between intermediate activations and the source data, enhancing the privacy of shared features. To verify the superiority of FLea, we conduct extensive experiments using a wide range of data modalities, simulating different levels of local data scarcity and label skew. The results demonstrate that FLea consistently outperforms state-of-the-art FL counterparts (in 13 of the 18 evaluated settings, the improvement is over 5%) while concurrently mitigating the privacy vulnerabilities associated with shared features. Code is available at https://github.com/XTxiatong/FLea.git
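
To make components ii) and iii) more concrete, the following is a minimal NumPy sketch, not the authors' implementation (see the linked repository for that). It assumes a standard mix-up with a Beta-distributed mixing weight and the usual sample distance correlation; the function names `mixup_features` and `distance_correlation`, the `alpha` value, and the array shapes are hypothetical choices made for illustration.

```python
import numpy as np


def mixup_features(local_feats, local_labels, global_feats, global_labels,
                   num_classes, alpha=2.0, rng=None):
    """Mix each local activation with a randomly drawn buffer activation.

    Assumption: lambda ~ Beta(alpha, alpha), and labels are mixed with the
    same weight as the features (standard mix-up).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = local_feats.shape[0]
    # Pick one global-buffer entry per local sample.
    idx = rng.integers(0, global_feats.shape[0], size=n)
    lam = rng.beta(alpha, alpha, size=n)
    # Reshape lambda so it broadcasts over all feature dimensions.
    lam_f = lam.reshape(n, *([1] * (local_feats.ndim - 1)))
    mixed_feats = lam_f * local_feats + (1.0 - lam_f) * global_feats[idx]
    # One-hot encode both label sets and mix them with the same weights.
    eye = np.eye(num_classes)
    mixed_labels = (lam[:, None] * eye[local_labels]
                    + (1.0 - lam[:, None]) * eye[global_labels[idx]])
    return mixed_feats, mixed_labels


def distance_correlation(x, z):
    """Sample distance correlation between two batches (rows are samples).

    A value near 0 suggests the activations z reveal little about the
    inputs x; a value near 1 suggests strong dependence.
    """
    x = x.reshape(x.shape[0], -1)
    z = z.reshape(z.shape[0], -1)

    def centered(a):
        # Pairwise Euclidean distances, then double centering.
        d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
        return (d - d.mean(axis=0, keepdims=True)
                - d.mean(axis=1, keepdims=True) + d.mean())

    a, b = centered(x), centered(z)
    dcov2 = (a * b).mean()          # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))
```

In a training loop, a quantity like `distance_correlation(inputs, activations)` would be added to the local objective as a penalty so that shared activations carry less information about the raw data. How it is weighted against the classification and distillation terms is a design choice not specified in this summary.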

Paper Structure

This paper contains 32 sections, 11 equations, 13 figures, 6 tables.

Figures (13)

  • Figure 1: Edge devices as clients in federated learning, where local data exhibits label skew (represented by different markers) and scarcity (usually very small in size).
  • Figure 2: Performance of FL methods with increasing data scarcity levels (a smaller $|\mathcal{D}_k|$ indicates more severe scarcity).
  • Figure 3: t-SNE of low-dimensional features, where color distinguishes classes, and the class-separation measure DB under different numbers of training samples.
  • Figure 4: Data augmentations. From (a) to (c), the privacy vulnerability is reduced. (b) is the average of a batch of samples like (a), but if the local data contains individual context information (e.g., (a*)), averaging over those samples cannot protect such information (e.g., (b*)). (c) shows a feature of (a*) and (c*) shows its reconstruction.
  • Figure 5: Overview of FLea for the $t$-th communication round.
  • ...and 8 more figures