Fed-AugMix: Balancing Privacy and Utility via Data Augmentation

Haoyang Li, Wei Chen, Xiaojin Zhang

TL;DR

This work tackles gradient leakage in federated learning by proposing Fed-AugMix, a client-side augmentation framework that uses AugMix to inject distortion into gradients and a Jensen-Shannon divergence loss to enforce prediction consistency across augmented views. By combining AugMix with Loss Scaling, the approach achieves a favorable privacy-utility trade-off, often preserving or even improving accuracy while substantially hindering data reconstruction attacks such as InvGrad. The key contributions include implementing AugMix at the client level, integrating a JS-divergence–based loss to safeguard privacy, and validating the method across MNIST and CIFAR datasets with multiple FL baselines. The results demonstrate robust privacy protection with competitive or superior performance, highlighting the practical potential of data augmentation as a privacy-preserving mechanism in federated learning.

Abstract

Gradient leakage attacks pose a significant threat to the privacy guarantees of federated learning. While distortion-based protection mechanisms are commonly employed to mitigate this issue, they often lead to notable performance degradation. Existing methods struggle to preserve model performance while ensuring privacy. To address this challenge, we propose a novel data augmentation-based framework designed to achieve a favorable privacy-utility trade-off, with the potential to enhance model performance in certain cases. Our framework incorporates the AugMix algorithm at the client level, enabling data augmentation with controllable severity. By integrating the Jensen-Shannon divergence into the loss function, we embed the distortion introduced by AugMix into the model gradients, effectively safeguarding privacy against deep leakage attacks. Moreover, the JS divergence promotes model consistency across different augmentations of the same image, enhancing both robustness and performance. Extensive experiments on benchmark datasets demonstrate the effectiveness and stability of our method in protecting privacy. Furthermore, our approach maintains, and in some cases improves, model performance, showcasing its ability to achieve a robust privacy-utility trade-off.
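The consistency term described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the three-way Jensen-Shannon divergence used in the original AugMix formulation (the mean KL divergence of each prediction from their average), and the function names and the `js_weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def js_divergence(dists):
    """Generalized JS divergence: mean KL of each distribution from their mixture."""
    n = len(dists)
    mixture = [sum(col) / n for col in zip(*dists)]
    return sum(kl_divergence(p, mixture) for p in dists) / n


def consistency_loss(logits_orig, logits_aug1, logits_aug2, label, js_weight=1.0):
    """Cross-entropy on the original view plus a weighted JS consistency term.

    js_weight is a hypothetical hyperparameter; the source does not state its value.
    """
    p_orig, p_aug1, p_aug2 = (
        softmax(l) for l in (logits_orig, logits_aug1, logits_aug2)
    )
    cross_entropy = -math.log(p_orig[label] + 1e-12)
    return cross_entropy + js_weight * js_divergence([p_orig, p_aug1, p_aug2])
```

When the three views produce identical predictions the JS term vanishes and the loss reduces to plain cross-entropy; the more the augmented views disagree with the original, the larger the penalty, which is how the augmentation-induced distortion enters the gradients.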

Paper Structure

This paper contains 14 sections, 6 equations, 9 figures, 3 tables, 2 algorithms.

Figures (9)

  • Figure 1: An illustration of the training process of Fed-AugMix. The client update consists of two parts: (1) in the data augmentation part, two augmented samples are generated from the original data; (2) in the model updating part, the JS divergence across the two augmented samples and the original data is computed and added to the loss, which is then used to update the parameters.
  • Figure 2: An example of AugMix. First, x_aug is generated using three stochastic augmentation chains. Then a "skip connection" is used to MixUp the augmented image and the original image.
  • Figure 3: Visualization of InvGrad attack results under varying privacy protection severity levels (s=0, 2, 6, 10). The second row shows reconstructions without protection, while the following rows display reconstructions with Fed-AugMix at different augmentation severities.
  • Figure 4: Relationship between test accuracy and MSE of reconstructed images under varying protection severity. Lower severity reduces MSE while improving accuracy compared to no protection. However, as augmentation severity increases, accuracy decreases for both untrained and converged models.
  • Figure 5: An illustration of the relationship between accuracy, SSIM, and PSNR for untrained and converged models across different datasets. The green region indicates improved accuracy alongside better privacy protection, while the red region reflects enhanced privacy protection at the cost of performance degradation. Most privacy protection mechanisms in FL fall within the red region.
  • ...and 4 more figures
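
The mixing procedure in the Figure 2 caption (three stochastic augmentation chains followed by a skip connection with the original image) might look roughly like the sketch below. It is an assumption-laden illustration: the Dirichlet chain weights and Beta skip weight follow the original AugMix formulation rather than anything stated here, the three pixel operations are hypothetical stand-ins for real image augmentations, and images are simplified to flat lists of values in [0, 1].

```python
import random


def brighten(pixels, severity):
    """Hypothetical op: shift every pixel up, clipped to 1.0."""
    shift = 0.03 * severity
    return [min(1.0, v + shift) for v in pixels]


def reduce_contrast(pixels, severity):
    """Hypothetical op: pull pixels toward the image mean."""
    factor = max(0.0, 1.0 - 0.08 * severity)
    mean = sum(pixels) / len(pixels)
    return [mean + factor * (v - mean) for v in pixels]


def blend_invert(pixels, severity):
    """Hypothetical op: blend each pixel with its inverse."""
    t = min(0.5, 0.05 * severity)
    return [(1.0 - t) * v + t * (1.0 - v) for v in pixels]


OPS = [brighten, reduce_contrast, blend_invert]


def augmix(pixels, severity=3, width=3, max_depth=3, alpha=1.0, rng=random):
    """Mix `width` random augmentation chains, then blend with the original."""
    # Dirichlet(alpha) chain weights, sampled as normalized Gamma draws.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(width)]
    total = sum(gammas)
    chain_weights = [g / total for g in gammas]
    # Beta(alpha, alpha) weight for the skip connection to the original image.
    skip_weight = rng.betavariate(alpha, alpha)

    mixed = [0.0] * len(pixels)
    for w in chain_weights:
        chain_out = list(pixels)
        # Each chain applies between 1 and max_depth randomly chosen ops.
        for _ in range(rng.randint(1, max_depth)):
            chain_out = rng.choice(OPS)(chain_out, severity)
        mixed = [m + w * c for m, c in zip(mixed, chain_out)]

    return [skip_weight * x + (1.0 - skip_weight) * m for x, m in zip(pixels, mixed)]
```

Because the output is a convex combination of the original and the chain outputs, pixel values stay in [0, 1], and `severity=0` leaves the image unchanged in this sketch, which corresponds to the unprotected setting.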