RAIFLE: Reconstruction Attacks on Interaction-based Federated Learning with Adversarial Data Manipulation

Dzung Pham; Shreyas Kulkarni; Amir Houmansadr

TL;DR

This work reveals a privacy vulnerability in interaction-based federated learning (IFL): a central server that controls item features can actively manipulate them through Adversarial Data Manipulation (ADM) to significantly improve its reconstruction of users' private interactions. The authors introduce RAIFLE, an optimization-based attack framework that jointly reconstructs interactions and, when possible, user embeddings. RAIFLE outperforms gradient inversion across federated recommender system (RS) and online learning to rank (OLTR) settings, including image-based modalities. The ADM techniques include fingerprinting and noise injection, with indirect manipulation for cases where the server has only limited control over features. The authors evaluate RAIFLE on RS (MovieLens-100K, Steam-200K) and OLTR (LETOR/MSLR, ImageNet-based) datasets, demonstrating strong reconstruction performance (often AUC near 0.9–1.0) and showing that standard defenses such as local differential privacy (DP) and Secure Aggregation can be overcome under certain conditions, though at utility costs. The paper also discusses countermeasures, utility implications, and practical considerations, highlighting a significant privacy risk in IFL and outlining directions for secure, private designs in RS, OLTR, and broader interactive learning scenarios.

Abstract

Federated learning has emerged as a promising privacy-preserving solution for machine learning domains that rely on user interactions, particularly recommender systems and online learning to rank. While there has been substantial research on the privacy of traditional federated learning, little attention has been paid to the privacy properties of these interaction-based settings. In this work, we show that users face an elevated risk of having their private interactions reconstructed by the central server when the server can control the training features of the items that users interact with. We introduce RAIFLE, a novel optimization-based attack framework where the server actively manipulates the features of the items presented to users to increase the success rate of reconstruction. Our experiments with federated recommendation and online learning-to-rank scenarios demonstrate that RAIFLE is significantly more powerful than existing reconstruction attacks like gradient inversion, achieving high performance consistently in most settings. We discuss the pros and cons of several possible countermeasures to defend against RAIFLE in the context of interaction-based federated learning. Our code is open-sourced at https://github.com/dzungvpham/raifle.
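
To make the idea of optimization-based interaction reconstruction concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a linear scoring model trained with squared loss, a single gradient as the user's update, and more features than items; the names `X`, `w`, `clicks`, and `local_update` are hypothetical. Under these assumptions the update is affine in the interactions, so matching it in the L2 sense reduces to ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_feats = 5, 8                    # assumption: n_feats >= n_items
X = rng.normal(size=(n_items, n_feats))    # item features chosen by the server
w = rng.normal(size=n_feats)               # current global model weights
clicks = np.array([1.0, 0.0, 0.0, 1.0, 0.0])  # user's private interactions


def local_update(interactions):
    """Gradient of 0.5 * ||X @ w - interactions||^2 with respect to w."""
    return X.T @ (X @ w - interactions)


# What the server receives from the user.
observed = local_update(clicks)

# The update is affine in the interactions:
#   observed = X.T @ X @ w - X.T @ interactions
# so the server can isolate X.T @ interactions and solve a least-squares problem.
target = X.T @ X @ w - observed
recovered, *_ = np.linalg.lstsq(X.T, target, rcond=None)

# Threshold the continuous estimate to get binary interaction guesses.
guess = (recovered > 0.5).astype(float)
```

In this toy setting the server's control over `X` is what makes the system well-conditioned; real IFL models are nonlinear and updates may be noised or aggregated, which is where the attack described in the paper goes beyond this sketch.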

Paper Structure (80 sections, 1 theorem, 13 equations, 12 figures, 13 tables, 2 algorithms)

Introduction
Background and Related Work
Attacks on FL
Passive Attacks
Active Attacks
Other Attacks
Privacy Defenses for FL
Differential Privacy
Secure Aggregation
Interaction-based FL (IFL)
Federated Recommender Systems
Federated Online Learning to Rank
Attacks on IFL
Overview of RAIFLE
Threat Model
...and 65 more sections

Key Result

Theorem 1

Assume that $g$ is twice-differentiable w.r.t. interactions $\mathcal{I'}$ and $\mathcal{L}_{atk}$ is the $L_2$ loss. If $\nabla_{\mathcal{I'}}^2 g = \mathbf{0}$, then RAIFLE is convex w.r.t. $\mathcal{I'}$.
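
One way to see why the zero-Hessian condition suffices is sketched below. This is an illustrative argument, not the paper's own proof. It assumes the attack objective has the form $\lVert g(\mathcal{I'}) - u \rVert_2^2$, where $u$ (a symbol introduced here only for illustration) is the update observed by the server and $\mathcal{I'}$ is treated as a vector over a convex domain. A vanishing Hessian means every component of $g$ is affine in $\mathcal{I'}$:

$$
\nabla_{\mathcal{I'}}^2 g = \mathbf{0} \;\Longrightarrow\; g(\mathcal{I'}) = A\,\mathcal{I'} + b \quad \text{for some matrix } A \text{ and vector } b .
$$

Substituting into the objective and differentiating twice gives

$$
\nabla_{\mathcal{I'}}^2 \,\lVert A\,\mathcal{I'} + b - u \rVert_2^2 = 2A^\top A \succeq 0 ,
$$

so the Hessian of the objective is positive semidefinite and the objective is convex in $\mathcal{I'}$. The same conclusion holds for the unsquared $L_2$ norm, since a norm composed with an affine map is convex.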

Figures (12)

Figure 1: Example of an interaction reconstruction attack in federated recommendation/learning-to-rank. A malicious server may infer user interactions from the FL updates to execute targeted advertising.
Figure 2: Diagram of Interaction-based Federated Learning (IFL). Users interact with server-prepared items and train the FL model using the items and their private interactions. Users may apply privacy defense techniques such as differential privacy before sending local updates to the server.
Figure 3: Example of the fingerprinting method for a 2-layer neural net. The feature parameters consist of all connections between the inputs and the first hidden layer. Feature $x_2$ is zeroed out, causing all feature weights corresponding to $x_2$ (dashed lines) to have 0 gradient during backpropagation.
Figure 4: Diagram of our partitioned noise injection ADM method for images. The FL server prepares two manipulated versions of an image by matching the image's extracted features to two different target noise vectors.
Figure 5: Examples of original and manipulated images from ImageNet for different vision models. Some artifacts are visible but subtle.
...and 7 more figures
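
The gradient effect described in the Figure 3 caption can be checked numerically. The sketch below is a hedged illustration rather than the paper's code: the layer sizes, the tanh activation, the squared loss, and the function name `first_layer_grad` are all assumptions. It shows that zeroing one input feature makes the gradient of every first-layer weight attached to that feature exactly zero.

```python
import numpy as np


def first_layer_grad(x, W1, W2, y):
    """Gradient of 0.5 * (W2 @ tanh(W1 @ x) - y)**2 with respect to W1."""
    h = np.tanh(W1 @ x)               # hidden activations, shape (H,)
    pred = W2 @ h                     # scalar prediction
    err = pred - y                    # derivative of the loss w.r.t. pred
    delta = err * W2 * (1.0 - h**2)   # backpropagate through tanh, shape (H,)
    return np.outer(delta, x)         # shape (H, D); column j is scaled by x[j]


rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))          # input-to-hidden ("feature") weights
W2 = rng.normal(size=4)               # hidden-to-output weights
x = np.array([0.7, 0.0, -1.2])        # feature x_2 (index 1) is zeroed out

grad = first_layer_grad(x, W1, W2, y=1.0)
zero_columns = np.flatnonzero(np.all(grad == 0, axis=0))
```

Because each column of the first-layer gradient is the backpropagated signal multiplied by the corresponding input, the pattern of all-zero columns reveals which features were zeroed in the item the user trained on.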

Theorems & Definitions (2)

Theorem 1: Convexity of RAIFLE
proof
