Label Inference Attacks against Node-level Vertical Federated GNNs
Marco Arazzi, Mauro Conti, Stefanos Koffas, Marina Krcek, Antonino Nocera, Stjepan Picek, Jing Xu
TL;DR
The paper investigates privacy risks in node-level vertical federated GNNs and proposes BlindSage, a zero-background-knowledge label inference attack that uses the gradients returned by the server to recover private labels, without prior label information or knowledge of the server architecture. BlindSage builds an approximate server model $g'$, constructs synthetic labels $SynLabels$, and minimizes a gradient-matching loss $D = \|\nabla W - \nabla W'\|_2$ between the observed gradients $\nabla W$ and the reproduced gradients $\nabla W'$, optionally using HDBSCAN to infer the number of classes. It supports online and offline execution and can run pre-attacks to approximate the architecture and the class count. Experiments across multiple GNN architectures (GCN, GAT, GraphSAGE) and datasets (Cora, Citeseer, PubMed, Polblogs, Reddit, ArXiv) show attack accuracies at or near 100% in basic knowledge settings and above 90% in most cases even with limited or no knowledge, while the evaluated defenses often degrade main-task performance. The work highlights significant privacy risks in VFL-GNNs, demonstrates the limited effectiveness of several defenses, and suggests defense directions such as decoupling the top model from bottom-model optimization to hinder label leakage, with future work toward graph-aware defenses and broader model families.
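The gradient-matching idea can be illustrated with a deliberately simplified toy, which is not the BlindSage implementation. It rests on several assumptions that are not taken from the paper: the top model is a single linear layer trained with cross-entropy, the attacker's clone $g'$ happens to equal the true top model, the number of classes is known, and the compared quantity is the gradient with respect to the attacker's embedding (a stand-in for $\nabla W$). Instead of optimizing synthetic labels iteratively, the sketch tries every candidate class per node and keeps the one with the smallest distance $D$. All function and variable names are hypothetical.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def embedding_gradient(W, h, label):
    """Cross-entropy gradient w.r.t. embedding h for a linear top model W (C x d)."""
    logits = [sum(w * x for w, x in zip(row, h)) for row in W]
    p = softmax(logits)
    p[label] -= 1.0  # dL/dlogits = softmax(logits) - one_hot(label)
    return [sum(W[c][j] * p[c] for c in range(len(W))) for j in range(len(h))]


def l2_distance(a, b):
    """Gradient-matching loss D = ||a - b||_2."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def infer_labels(W_clone, embeddings, observed_grads, num_classes):
    """For each node, keep the candidate label whose reproduced gradient is closest."""
    guesses = []
    for h, g in zip(embeddings, observed_grads):
        dists = [
            l2_distance(g, embedding_gradient(W_clone, h, c))
            for c in range(num_classes)
        ]
        guesses.append(dists.index(min(dists)))
    return guesses


# Toy data: 3 classes, 4-dimensional embeddings, 20 nodes (all values synthetic).
rng = random.Random(0)
C, d, n = 3, 4, 20
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(C)]
H = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
true_labels = [rng.randrange(C) for _ in range(n)]

# Gradients the server would send back to the passive party in this toy setting.
observed = [embedding_gradient(W, h, y) for h, y in zip(H, true_labels)]

# Idealised attacker: the clone equals the real top model (an assumption).
recovered = infer_labels(W, H, observed, C)
accuracy = sum(r == t for r, t in zip(recovered, true_labels)) / n
print(f"toy label recovery accuracy: {accuracy:.2f}")
```

In this idealised setting the reproduced gradient for the correct label matches the observed one exactly, so the distance is zero only for the true class. With an approximate clone the distances no longer vanish, which is presumably why the attack described above refines the clone and the synthetic labels by minimizing $D$ rather than relying on an exact match.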
Abstract
Federated learning enables collaborative training of machine learning models while keeping the raw data of the involved workers private. Three of its main objectives are to improve the models' privacy, security, and scalability. Vertical Federated Learning (VFL) offers an efficient cross-silo setting where a few parties collaboratively train a model without sharing the same features. In such a scenario, classification labels are commonly considered sensitive information held exclusively by one (active) party, while other (passive) parties use only their local information. Recent works have uncovered important flaws of VFL, leading to possible label inference attacks under the assumption that the attacker has some, even limited, background knowledge on the relation between labels and data. In this work, we are the first (to the best of our knowledge) to investigate label inference attacks on VFL using a zero-background-knowledge strategy. To formulate our proposal, we focus on Graph Neural Networks (GNNs) as a target model for the underlying VFL. In particular, we refer to node classification tasks, which are widely studied and on which GNNs have shown promising results. Our proposed attack, BlindSage, provides impressive results in the experiments, achieving nearly 100% accuracy in most cases. Even when the attacker has no information about the architecture used or the number of classes, the accuracy remains above 90% in most instances. Finally, we observe that well-known defenses cannot mitigate our attack without affecting the model's performance on the main classification task.
