Exploiting Unintended Feature Leakage in Collaborative Learning
Luca Melis, Congzheng Song, Emiliano De Cristofaro, Vitaly Shmatikov
TL;DR
The paper analyzes privacy risks in collaborative and federated learning, where participants exchange only model updates rather than raw data. It introduces passive and active inference attacks that reveal membership of specific data points and properties that hold for only a subset of a participant's data, exploiting information carried by gradients and embedding-layer updates. In both two-party and multi-party settings, the attacks show substantial leakage, including inference of when a property (such as a specific person) appears during training, while the defenses considered prove largely ineffective in practical regimes. The findings underscore the need for new privacy-preserving techniques and detection mechanisms tailored to collaborative learning.
Abstract
Collaborative machine learning and related techniques such as federated learning allow multiple participants, each with his own training dataset, to build a joint model by training locally and periodically exchanging model updates. We demonstrate that these updates leak unintended information about participants' training data and develop passive and active inference attacks to exploit this leakage. First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference). Then, we show how this adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture. For example, he can infer when a specific person first appears in the photos used to train a binary gender classifier. We evaluate our attacks on a variety of tasks, datasets, and learning configurations, analyze their limitations, and discuss possible defenses.
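One intuition for why exchanged updates can leak exact data points is that the gradient of an embedding layer is sparse: only the rows for tokens (for example, words or location IDs) that actually occur in a batch receive a nonzero gradient. The following is a toy sketch of that effect, not the authors' implementation. The function names, vocabulary size, learning rate, and the random stand-in for the upstream gradient are all assumptions made for illustration.

```python
import numpy as np

def embedding_gradient(embedding, token_ids, upstream_grad):
    """Gradient of the loss w.r.t. the embedding matrix for one batch.

    Row i of the result is the sum of the upstream gradients of every
    position in the batch where token i was looked up; all other rows
    stay exactly zero.
    """
    grad = np.zeros_like(embedding)
    # np.add.at accumulates correctly when a token id repeats in the batch.
    np.add.at(grad, token_ids, upstream_grad)
    return grad

def tokens_revealed_by_update(update, tol=0.0):
    """Token ids whose embedding rows changed in an observed update."""
    changed_rows = np.abs(update).sum(axis=1) > tol
    return set(np.flatnonzero(changed_rows).tolist())

# Hypothetical setup: 20-token vocabulary, 4-dimensional embeddings.
rng = np.random.default_rng(0)
vocab_size, dim = 20, 4
embedding = rng.normal(size=(vocab_size, dim))

# The victim's private batch (assumed values) and a random stand-in for
# the gradient flowing back from the rest of the model.
victim_batch = np.array([3, 7, 7, 12])
upstream = rng.normal(size=(len(victim_batch), dim))

# The update the victim would share after one SGD step (assumed lr).
lr = 0.1
update = -lr * embedding_gradient(embedding, victim_batch, upstream)

# An observer of the update recovers which tokens were in the batch.
recovered = tokens_revealed_by_update(update)
print(sorted(recovered))
```

In a multi-party setting the observed update typically aggregates several participants' contributions, so this sketch shows only the simplest case, in which a single participant's update is visible directly.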
