Privacy Drift: Evolving Privacy Concerns in Incremental Learning

Sayyed Farid Ahamed, Soumya Banerjee, Sandip Roy, Aayush Kapoor, Marc Vucovich, Kevin Choi, Abdul Rahman, Edward Bowen, Sachin Shetty

TL;DR

This paper introduces privacy drift, a framework paralleling concept drift to describe how private information leakage evolves during incremental training in Federated Learning. It investigates how data drift, model evolution, and attack-development dynamics influence membership inference risk, using CIFAR-20 non-IID partitions and incremental training to reveal non-monotonic privacy behavior. The study demonstrates a persistent correlation between model accuracy and privacy leakage (MIA AUC) under both centralized and federated settings, highlighting that improving performance can elevate privacy risk. The findings motivate privacy-aware strategies, such as differential privacy and secure aggregation, to balance accuracy and privacy in dynamic, decentralized learning environments.

Abstract

In the evolving landscape of machine learning (ML), Federated Learning (FL) presents a paradigm shift towards decentralized model training while preserving user data privacy. This paper introduces the concept of "privacy drift", an innovative framework that parallels the well-known phenomenon of concept drift. While concept drift addresses the variability in model accuracy over time due to changes in the data, privacy drift encapsulates the variation in the leakage of private information as models undergo incremental training. By defining and examining privacy drift, this study aims to unveil the nuanced relationship between the evolution of model performance and the integrity of data privacy. Through rigorous experimentation, we investigate the dynamics of privacy drift in FL systems, focusing on how model updates and data distribution shifts influence the susceptibility of models to privacy attacks, such as membership inference attacks (MIA). Our results highlight a complex interplay between model accuracy and privacy safeguards, revealing that enhancements in model performance can lead to increased privacy risks. We provide empirical evidence from experiments on customized datasets derived from CIFAR-100 (Canadian Institute for Advanced Research, 100 classes), showcasing the impact of data and concept drift on privacy. This work lays the groundwork for future research on privacy-aware machine learning, aiming to achieve a delicate balance between model accuracy and data privacy in decentralized environments.
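As a rough illustration of the two quantities the summary relies on, the sketch below computes the AUC of a simple score-threshold membership inference attack and the Pearson correlation between two series, such as per-step training accuracy and MIA AUC. It is a minimal sketch, not the authors' pipeline: the function names `mia_auc` and `pearson` are hypothetical, the attack score is assumed to be any per-sample value that tends to be higher for training members (for example, model confidence), and the sample values are invented purely to exercise the code.

```python
from math import sqrt


def mia_auc(member_scores, nonmember_scores):
    """AUC of a threshold attack: the probability that a randomly chosen
    member scores higher than a randomly chosen non-member (ties count 0.5).
    0.5 means the attack is no better than chance; 1.0 means full leakage."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented scores, for illustration only (not results from the paper).
members = [0.9, 0.4]      # attack scores on samples seen in training
nonmembers = [0.5, 0.1]   # attack scores on held-out samples
auc = mia_auc(members, nonmembers)
```

In an incremental-learning setting, one would presumably evaluate `mia_auc` after each training step and pass the resulting series, together with the training accuracies at the same steps, to `pearson`; a coefficient near 1 would correspond to the accuracy-leakage coupling described above.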

Paper Structure

This paper contains 14 sections, 7 figures.

Figures (7)

  • Figure 1: Four-split Non-IID partitioning of CIFAR-20 Dataset.
  • Figure 2: Design of training and test sets for incremental learning.
  • Figure 3: Training accuracy, test accuracy, and MIA AUC (area under the curve) for each permutation in the uniform test paradigm, illustrating the relationship between model performance and privacy leakage.
  • Figure 4: Pearson correlation between training accuracy and MIA AUC (area under the curve), illustrating the relationship across different data distributions and testing paradigms.
  • Figure 5: Privacy drift in CIFAR-20 under the uniform test paradigm.
  • ...and 2 more figures

Theorems & Definitions (1)

  • Definition 1