SplitOut: Out-of-the-Box Training-Hijacking Detection in Split Learning via Outlier Detection
Ege Erdogan, Unat Teksen, Mehmet Salih Celiktenyildiz, Alptekin Kupcu, A. Ercument Cicek
TL;DR
SplitOut addresses training-hijacking in SplitNN by using an out-of-the-box Local Outlier Factor (LOF) detector applied to gradients gathered during a brief client training phase on a small data fraction (e.g., $1\%$). The method, enhanced by a window-based decision rule, requires minimal hyperparameter tuning and leverages the intrinsic divergence between honest and attack-driven gradient neighborhoods to achieve near-zero false positives across multiple datasets and attack variants, including adaptive multitask attackers. It demonstrates strong detection performance on MNIST, Fashion-MNIST, and CIFAR datasets, with robustness to adaptive strategies and the ability to complement existing defenses like SplitGuard. The work highlights a practical, proactive defense for privacy-preserving split learning and outlines limitations and avenues for future work, such as expanding beyond feature-space alignment attacks and addressing later-epoch attacks.
Abstract
Split learning enables efficient and privacy-aware training of a deep neural network by partitioning the network so that the clients (data holders) compute the first layers and share only the intermediate outputs with a central, compute-heavy server. This paradigm introduces a new attack medium in which the server has full control over what the client models learn, and this control has already been exploited to infer the clients' private data and to implant backdoors in the client models. Although previous work has shown that clients can successfully detect such training-hijacking attacks, the proposed methods rely on heuristics, require tuning many hyperparameters, and do not fully utilize the clients' capabilities. In this work, we show that, given modest assumptions about the clients' compute capabilities, an out-of-the-box outlier detection method can detect existing training-hijacking attacks with near-zero false positive rates. We conclude from experiments on different tasks that the simplicity of our approach, which we name \textit{SplitOut}, makes it a more viable and reliable alternative to earlier detection methods.
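The detection idea summarized above can be illustrated with a minimal sketch. It assumes scikit-learn's `LocalOutlierFactor` as the out-of-the-box detector and replaces real gradients with synthetic Gaussian vectors. The helper names (`fit_detector`, `window_decision`), the majority-vote window rule, the window size, and the neighbor count are all illustrative assumptions, not the exact configuration used in the paper.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor


def fit_detector(honest_grads, n_neighbors=20):
    """Fit LOF on gradients the client collected during honest local training.

    novelty=True allows the fitted model to score previously unseen samples.
    """
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, novelty=True)
    lof.fit(honest_grads)
    return lof


def window_decision(lof, incoming_grads, window=10):
    """Return the first step at which a majority of the last `window`
    gradients are labeled outliers, or None if that never happens.

    This majority rule is an assumed stand-in for a window-based decision rule.
    """
    labels = lof.predict(incoming_grads)  # +1 = inlier, -1 = outlier
    is_outlier = labels == -1
    for t in range(window, len(is_outlier) + 1):
        if is_outlier[t - window:t].sum() > window / 2:
            return t - 1
    return None


# Synthetic stand-ins for flattened gradient vectors (assumption: real
# gradients would come from training on a small fraction of client data).
rng = np.random.default_rng(0)
dim = 32
honest_train = rng.normal(0.0, 1.0, size=(200, dim))    # client-collected reference set
honest_stream = rng.normal(0.0, 1.0, size=(50, dim))    # gradients from an honest server
hijacked_stream = rng.normal(4.0, 1.0, size=(50, dim))  # gradients from a hijacking server

detector = fit_detector(honest_train)
honest_alarm = window_decision(detector, honest_stream)
attack_alarm = window_decision(detector, hijacked_stream)

print("honest server alarm step:", honest_alarm)
print("hijacking server alarm step:", attack_alarm)
```

In this toy setting the shifted gradient distribution is easy to separate; real hijacking gradients may differ from honest ones in subtler ways, so the sketch only shows the mechanics of fitting on a reference set and deciding over a window.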
