Support Vector Based Anomaly Detection in Federated Learning

Massimo Frasson, Dario Malchiodi

TL;DR

The paper addresses anomaly detection under privacy constraints by developing two federated, support-vector-based methods: Ensemble SVDD (ESVDD) and Support Vector Election (SVE). These approaches replace neural networks with SVDD-like techniques and incorporate privacy-preserving mechanisms to avoid data leakage in Federated Learning (FL). Experiments across multiple datasets show that SVE generally matches centralized baselines under favorable conditions, and that ESVDD can outperform them at the cost of a growing model size; anonymization has limited negative impact in most settings. The work demonstrates the practicality of SV-based anomaly detection in federated environments and sets the stage for further developments, including broader validation and formal privacy guarantees.

Abstract

Anomaly detection plays a crucial role in various domains, from cybersecurity to industrial systems. However, traditional centralized approaches often encounter challenges related to data privacy. In this context, Federated Learning emerges as a promising solution. This work introduces two innovative algorithms--Ensemble SVDD and Support Vector Election--that leverage Support Vector Machines for anomaly detection in a federated setting. In comparison with the Neural Networks typically used within Federated Learning, these new algorithms emerge as potential alternatives, as they can operate effectively with small datasets and incur lower computational costs. The novel algorithms are tested in various distributed system configurations, yielding promising initial results that pave the way for further investigation.

Paper Structure

This paper contains 10 sections, 1 equation, 3 figures, 2 tables, 2 algorithms.

Figures (3)

  • Figure 1: Impact of the data split distribution on the performance of SVE (a) and ESVDD (b), for different combinations of $K$ and $F$, shown on the x-axis. The chart shows the mean and standard deviation of the AUC difference over several runs, with positive values favoring i.i.d. over biased splits, and vice versa.
  • Figure 2: Impact of the client fraction $F$ on the performance of SVE (a) and ESVDD (b), for different combinations of $K$ and of the data split type. Same notation as in Fig. 1, with positive values favoring the usage of all data over a subset, and vice versa.
  • Figure 3: Impact of the anonymisation technique on the performance of SVE (a) and ESVDD (b), for each combination of $K$, $F$ and split bias. Same setting as in the previous figures, with positive values favoring the original data over anonymised data.
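
The quantity plotted in the three figures, the mean and standard deviation of an AUC difference over several runs, can be illustrated with a minimal sketch. It rests on assumptions not stated in the captions: that the difference is taken run by run between the two settings being compared, and that the sample standard deviation is used. The function name `auc_difference_summary` and all AUC values are hypothetical and not taken from the paper.

```python
from statistics import mean, stdev


def auc_difference_summary(auc_reference, auc_alternative):
    """Return (mean, sample std) of the per-run AUC differences.

    A positive mean favors the reference setting (e.g. i.i.d. splits);
    a negative mean favors the alternative (e.g. biased splits).
    Pairing the runs one-to-one is an assumption of this sketch.
    """
    if len(auc_reference) != len(auc_alternative):
        raise ValueError("both settings need the same number of runs")
    diffs = [ref - alt for ref, alt in zip(auc_reference, auc_alternative)]
    return mean(diffs), stdev(diffs)


# Hypothetical AUC values for three runs (illustrative only).
auc_iid = [0.90, 0.92, 0.88]
auc_biased = [0.85, 0.90, 0.86]

avg, spread = auc_difference_summary(auc_iid, auc_biased)
print(f"mean difference: {avg:+.3f}, std: {spread:.3f}")
```

With these made-up numbers the mean difference is positive, which in the convention of Figure 1 would read as favoring the i.i.d. split. The same helper applies to Figures 2 and 3 by swapping in the corresponding pair of settings.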