How the Graph Construction Technique Shapes Performance in IoT Botnet Detection

Hassan Wasswa, Hussein Abbass, Timothy Lynar

TL;DR

This study evaluates how the method used to construct the graph-structured dataset affects the classification performance of a GNN model. The results indicate that the Gabriel graph achieves the highest detection performance, with an accuracy of 97.56%.

Abstract

The increasing incidence of IoT-based botnet attacks has driven interest in advanced learning models for detection. Recent efforts have focused on leveraging attention mechanisms to model long-range feature dependencies and Graph Neural Networks (GNNs) to capture relationships between data instances. Since GNNs require graph-structured input, tabular NetFlow data must first be transformed into a graph. This study evaluates how the method used to construct the graph-structured dataset affects the classification performance of a GNN model. Five methods were evaluated: k-Nearest Neighbors, Mutual Nearest Neighbors, Shared Nearest Neighbor (SNN), Gabriel Graph, and epsilon-radius Graph. To reduce the computational burden associated with high-dimensional data, a Variational Autoencoder (VAE) projects the original features into a lower-dimensional latent space prior to graph generation. A Graph Attention Network (GAT) is then trained on each graph to classify traffic in the N-BaIoT dataset into three categories: Normal, Mirai, and Gafgyt. The results indicate that the Gabriel graph achieves the highest detection performance, with an accuracy of 97.56%, while SNN recorded the lowest, with an accuracy as low as 78.56%.
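
To make the graph construction step concrete, the following is a minimal sketch of four of the five methods named in the abstract, written in plain Python over a small list of points. It is an illustration under stated assumptions, not the authors' implementation: the function names, the brute-force distance computation, the parameters k and eps, and the toy 2-D points (standing in for VAE latent vectors) are all hypothetical. The Gabriel rule used here is the common textbook definition (two points are linked when no third point lies strictly inside the ball whose diameter is the segment between them). SNN is omitted because its definition varies across the literature.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_lists(points, k):
    """For each point, the indices of its k nearest other points."""
    n = len(points)
    return [
        sorted((j for j in range(n) if j != i),
               key=lambda j: sq_dist(points[i], points[j]))[:k]
        for i in range(n)
    ]


def knn_graph(points, k):
    """Undirected kNN graph: keep (i, j) if either point lists the other."""
    nbrs = knn_lists(points, k)
    return {tuple(sorted((i, j))) for i in range(len(points)) for j in nbrs[i]}


def mutual_knn_graph(points, k):
    """Mutual kNN graph: keep (i, j) only if each point lists the other."""
    nbrs = knn_lists(points, k)
    return {(i, j) for i, j in combinations(range(len(points)), 2)
            if j in nbrs[i] and i in nbrs[j]}


def epsilon_graph(points, eps):
    """Epsilon-radius graph: keep (i, j) if the points are within eps."""
    return {(i, j) for i, j in combinations(range(len(points)), 2)
            if sq_dist(points[i], points[j]) <= eps ** 2}


def gabriel_graph(points):
    """Gabriel graph: keep (i, j) if no third point k lies strictly inside
    the ball with diameter ij, i.e. d(i,k)^2 + d(j,k)^2 >= d(i,j)^2."""
    n = len(points)
    edges = set()
    for i, j in combinations(range(n), 2):
        d_ij = sq_dist(points[i], points[j])
        if all(sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) >= d_ij
               for k in range(n) if k not in (i, j)):
            edges.add((i, j))
    return edges


# Hypothetical toy data standing in for low-dimensional latent vectors.
points = [(0.0, 0.0), (2.0, 0.0), (0.9, 0.5), (5.0, 5.0)]
print("kNN (k=1):       ", sorted(knn_graph(points, 1)))
print("mutual kNN (k=1):", sorted(mutual_knn_graph(points, 1)))
print("epsilon (1.5):   ", sorted(epsilon_graph(points, 1.5)))
print("Gabriel:         ", sorted(gabriel_graph(points)))
```

Note that the Gabriel graph needs no tuning parameter, whereas the kNN, mutual kNN, and epsilon-radius graphs each depend on a chosen k or eps. The brute-force Gabriel check is cubic in the number of points, which is one reason a reduction to a lower-dimensional space and a modest sample size are helpful before graph generation.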

Paper Structure (15 sections, 4 equations, 3 figures)


Figures (3)

  • Figure 1: Detection framework
  • Figure 2: Classification accuracy comparison for the different graph data construction techniques
  • Figure 3: Performance comparison in terms of Precision, Recall and F1-score