Information Leakage from Data Updates in Machine Learning Models

Tian Hui, Farhad Farokhi, Olga Ohrimenko

TL;DR

This work studies privacy leakage when training data is updated by changing attribute values rather than adding or removing records. It develops two black-box attacks—one using only the updated model and a second comparing the original and updated models—based on prediction confidence differences to infer updated attributes or identify updated records. Empirical results on Census and LendingClub with MLP and LR models show that two-snapshot attacks yield higher leakage than using a single model, with rare attribute values and repeated updates exacerbating vulnerability. The paper discusses defenses such as batch updates and differential privacy and highlights practical implications for model maintenance under privacy regulations.

Abstract

In this paper we consider the setting where machine learning models are retrained on updated datasets in order to incorporate the most up-to-date information or reflect distribution shifts. We investigate whether one can infer information about these updates in the training data (e.g., changes to attribute values of records). Here, the adversary has access to snapshots of the machine learning model before and after the change in the dataset occurs. In contrast to the existing literature, we assume that an attribute of one or more training data points is changed, rather than entire data records being removed or added. We propose attacks based on the difference in the prediction confidence of the original model and the updated model. We evaluate our attack methods on two public datasets with multi-layer perceptron and logistic regression models. We validate that access to two snapshots of the model can result in higher information leakage than access to only the updated model. Moreover, we observe that data records with rare values are more vulnerable to attacks, which points to disparate vulnerability to privacy attacks in the update setting. When multiple records with the same original attribute value are updated to the same new value (i.e., repeated changes), the attacker is more likely to correctly guess the updated values, since repeated changes leave a larger footprint on the trained model. These observations point to the vulnerability of machine learning models to attribute inference attacks in the update setting.

Paper Structure (28 sections, 1 figure, 3 tables, 2 algorithms)

Figures (1)

  • Figure 1: In this experiment, 1000 data records with $\mathsf{MaritalStatus}$ "married" are chosen from the Census dataset; $\mathsf{MaritalStatus}$ of 100 of these records is then updated to "divorced". An attacker needs to guess which of the 1000 records have been updated. The attacker can vary the number of their guesses by choosing the top $k$ points sorted by the model confidence difference. By changing $k$, we can vary the false positive rate.
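
The top-$k$ guessing procedure in the Figure 1 caption can be sketched as follows. This is a minimal illustration rather than the paper's implementation. The function names `top_k_guesses` and `tpr_fpr`, the use of the absolute confidence difference as the ranking score, and the synthetic confidence values are all assumptions made for the example. Only the setup (1000 candidate records, 100 of them updated, guesses ranked by confidence difference) comes from the caption.

```python
import numpy as np

def top_k_guesses(conf_old, conf_new, k):
    """Indices of the k records whose confidence changed the most.

    Assumption: records are ranked by the absolute difference between the
    original and updated models' confidence on the same record.
    """
    diff = np.abs(np.asarray(conf_new) - np.asarray(conf_old))
    return np.argsort(-diff, kind="stable")[:k]

def tpr_fpr(guessed, updated_mask):
    """True and false positive rates of a set of guessed record indices."""
    updated_mask = np.asarray(updated_mask, dtype=bool)
    flagged = np.zeros(updated_mask.shape, dtype=bool)
    flagged[guessed] = True
    tp = np.sum(flagged & updated_mask)
    fp = np.sum(flagged & ~updated_mask)
    return tp / updated_mask.sum(), fp / (~updated_mask).sum()

# Synthetic stand-in for the experiment: 1000 records, 100 of them updated.
rng = np.random.default_rng(0)
n, n_updated = 1000, 100
updated_mask = np.zeros(n, dtype=bool)
updated_mask[rng.choice(n, n_updated, replace=False)] = True

# Hypothetical confidences; the 0.05 shift and 0.02 noise are made-up values.
conf_old = rng.uniform(0.5, 0.9, size=n)
shift = np.where(updated_mask, 0.05, 0.0)
conf_new = np.clip(conf_old + shift + rng.normal(0.0, 0.02, size=n), 0.0, 1.0)

# Sweeping k trades off true positives against false positives.
for k in (50, 100, 200):
    tpr, fpr = tpr_fpr(top_k_guesses(conf_old, conf_new, k), updated_mask)
    print(f"k={k}: TPR={tpr:.3f}, FPR={fpr:.3f}")
```

Increasing `k` flags more records, so both rates rise together. Plotting the resulting (FPR, TPR) pairs over all values of `k` gives an ROC-style curve for the attacker.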