Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection

Leonardo Henrique de Melo, Gustavo de Carvalho Bertoli, Michele Nogueira, Aldri Luiz dos Santos, Lourenço Alves Pereira Junior

TL;DR

Anomaly-Flow proposes a privacy-preserving approach to DDoS detection across multiple network domains by integrating Federated Learning with GAN-based synthetic data (GANomaly) to enable cross-domain learning and external-model sharing without exposing raw data. The framework trains locally in silos, aggregates via FedAvg, and then uses the global model to generate synthetic benign data to train heterogeneous models for external entities, addressing data privacy while improving generalization. Evaluated on three NetFlow datasets, it achieves an average F1-score of $0.747$ after $10$ federated rounds, and demonstrates competitive cross-domain performance and the potential for transferring learned DDoS patterns to unseen domains through synthetic data. The work highlights challenges in data quality, domain generalization, and deployment, and points to opportunities such as adaptive thresholds and standardized cross-domain evaluation to advance practical, privacy-preserving network defense.
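
The FedAvg aggregation step mentioned above can be illustrated with a minimal sketch. It assumes each silo reports a flat list of model parameters together with its local sample count; the function name `fed_avg` and the toy values are hypothetical and are not taken from the paper's implementation.

```python
from typing import List, Sequence


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Sample-size-weighted average of client parameter vectors (FedAvg).

    client_weights: one flat parameter vector per client (assumed layout).
    client_sizes:   number of local training samples per client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    n_params = len(client_weights[0])
    aggregated = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != n_params:
            raise ValueError("all clients must share the same model shape")
        # Each client contributes in proportion to its share of the data.
        share = size / total
        for i, value in enumerate(weights):
            aggregated[i] += share * value
    return aggregated


# Toy example: two silos, the second holding three times as much data.
global_weights = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a federated round, the server would send `global_weights` back to every silo as the starting point for the next round of local training.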

Abstract

Distributed denial-of-service (DDoS) attacks remain a critical threat to Internet services, causing costly disruptions. While machine learning (ML) has shown promise in DDoS detection, current solutions struggle with multi-domain environments where attacks must be detected across heterogeneous networks and organizational boundaries. This limitation severely impacts the practical deployment of ML-based defenses in real-world settings. This paper introduces Anomaly-Flow, a novel framework that addresses this critical gap by combining Federated Learning (FL) with Generative Adversarial Networks (GANs) for privacy-preserving, multi-domain DDoS detection. Our proposal enables collaborative learning across diverse network domains while preserving data privacy through synthetic flow generation. Through extensive evaluation across three distinct network datasets, Anomaly-Flow achieves an average F1-score of $0.747$, outperforming baseline models. Importantly, our framework enables organizations to share attack detection capabilities without exposing sensitive network data, making it particularly valuable for critical infrastructure and privacy-sensitive sectors. Beyond immediate technical contributions, this work provides insights into the challenges and opportunities in multi-domain DDoS detection, establishing a foundation for future research in collaborative network defense systems. Our findings have important implications for academic research and industry practitioners working to deploy practical ML-based security solutions.

Paper Structure

This paper contains 12 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Use-case training diagram using generative models in an FL schema: ① FL training on large datasets with the GANomaly model; ② generation of synthetic data using the global model trained in the FL schema; ③ use of the synthetic data produced by the generative model to train heterogeneous models that the FL participants and third-party entities can use.
  • Figure 2: Class distribution and imbalance across analyzed datasets for DDoS detection, highlighting the skewness in class proportions and sample quantities. The benign class dominates in CICIDS-2018 and TON-IoT, whereas DDoS traffic overwhelmingly prevails in Bot-IoT, illustrating significant disparities in data distribution across datasets.
  • Figure 3: (a) Structure of the train/test data split: model A is trained on the train split of dataset A and then evaluated on the test split of the same dataset, referred to as local evaluation. (b) Cross-evaluation procedure: model A, previously trained on dataset A, is evaluated on data from a different dataset (dataset B in this example).
  • Figure 4: Average F1-score for baseline algorithms, showing their generalization performance when trained on one dataset and tested on the other two in the context of DDoS detection. The results reveal significant variability in performance across models, with most algorithms achieving low scores, indicating poor cross-dataset generalization. Notably, the Energy-based Flow Classifier (EFC) substantially outperforms the others, suggesting its robustness in diverse network scenarios. These findings emphasize the challenges of deploying machine learning models in heterogeneous environments and underscore the importance of designing algorithms capable of handling such variability.
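
The cross-evaluation procedure described in the captions of Figures 3 and 4 (train on one dataset, test on the others, then average the F1-scores) can be sketched as follows. This is an illustrative sketch only: the helper names `f1_score` and `cross_evaluate`, the threshold "models", and the toy data are all hypothetical stand-ins and do not reproduce the paper's baselines or datasets.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1-score with label 1 (e.g. DDoS) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision/recall undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_evaluate(
    models: Dict[str, Callable[[float], int]],
    test_sets: Dict[str, Tuple[List[float], List[int]]],
) -> Dict[str, float]:
    """Average F1 of each model on every test set except its own."""
    results = {}
    for source, predict in models.items():
        scores = [
            f1_score(labels, [predict(x) for x in features])
            for target, (features, labels) in test_sets.items()
            if target != source  # skip local evaluation
        ]
        results[source] = sum(scores) / len(scores) if scores else 0.0
    return results


# Hypothetical one-feature "models" standing in for trained classifiers.
models = {
    "A": lambda x: int(x > 0.5),  # thresholds the single feature
    "B": lambda x: 1,             # flags every flow as an attack
}
test_sets = {
    "A": ([0.1, 0.9], [0, 1]),
    "B": ([0.2, 0.8], [0, 1]),
}
cross_f1 = cross_evaluate(models, test_sets)
```

Here `cross_f1["A"]` is the score of model A on dataset B only, since just two toy datasets are used; with three datasets, each entry would be the mean over the two unseen ones.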