Hybrid Machine Learning Models for Intrusion Detection in IoT: Leveraging a Real-World IoT Dataset

Md Ahnaf Akif, Ismail Butun, Andre Williams, Imadeldin Mahgoub

TL;DR

This work addresses intrusion detection in IoT networks with hybrid voting classifiers: RF, XGBoost, and KNN for binary classification, and RF, XGBoost, and AdaBoost for multi-class classification, both evaluated on the IoT-23 benchmark. The approach leverages ensemble strengths to improve accuracy and scalability, reporting near-perfect binary performance (around 99.99%) and about 99% across the reported metrics for multi-class tasks. A data-preprocessing pipeline (missing-value imputation, one-hot encoding, feature engineering, and data splitting) supports robust learning, while attention to data volume, class imbalance, and information leakage improves generalization. The results underscore the practical potential of robust, scalable IDS frameworks for real-world IoT security, with future work addressing real-time deployment, further feature selection, threat adaptation, and explainable AI.

Abstract

The rapid growth of the Internet of Things (IoT) has revolutionized industries, enabling unprecedented connectivity and functionality. However, this expansion also increases vulnerabilities, exposing IoT networks to increasingly sophisticated cyberattacks. Intrusion Detection Systems (IDS) are crucial for mitigating these threats, and recent advancements in Machine Learning (ML) offer promising avenues for improvement. This research explores a hybrid approach that combines several standalone ML models, namely Random Forest (RF), XGBoost, K-Nearest Neighbors (KNN), and AdaBoost, in a voting-based hybrid classifier for effective IoT intrusion detection. This ensemble method leverages the strengths of the individual algorithms to enhance accuracy and address challenges related to data complexity and scalability. Using the widely cited IoT-23 dataset, a prominent benchmark in IoT cybersecurity research, we evaluate our hybrid classifiers on both binary and multi-class intrusion detection problems, which allows a fair comparison with existing literature. Results demonstrate that our proposed hybrid models, designed for robustness and scalability, outperform standalone approaches in IoT environments. This work contributes to the development of advanced, intelligent IDS frameworks capable of addressing evolving cyber threats.
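
To make the voting idea concrete, the following is a minimal sketch of how the outputs of three base models could be combined. It is not the authors' implementation: the summary above does not state whether hard (majority) or soft (probability-averaging) voting is used, so both are shown. The function names, the labels "benign" and "malicious", and all prediction values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Sequence


def hard_vote(predictions: Sequence[Sequence[str]]) -> list[str]:
    """Majority vote. predictions[m][i] is model m's label for sample i."""
    voted = []
    for sample_votes in zip(*predictions):
        # most_common(1) returns the most frequent label; on a tie the
        # label seen first (the earlier-listed model) wins.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


def soft_vote(
    probabilities: Sequence[Sequence[Sequence[float]]],
    classes: Sequence[str],
) -> list[str]:
    """Average class probabilities over models, then take the arg max.

    probabilities[m][i][k] is model m's probability of classes[k] for sample i.
    """
    voted = []
    for sample_probs in zip(*probabilities):
        n_models = len(sample_probs)
        mean = [
            sum(p[k] for p in sample_probs) / n_models
            for k in range(len(classes))
        ]
        voted.append(classes[mean.index(max(mean))])
    return voted


# Hypothetical label predictions from three base models (e.g. RF, XGBoost, KNN).
rf_labels = ["benign", "malicious", "malicious", "benign"]
xgb_labels = ["benign", "malicious", "benign", "benign"]
knn_labels = ["malicious", "malicious", "malicious", "benign"]
hard_result = hard_vote([rf_labels, xgb_labels, knn_labels])

# Hypothetical class probabilities for two samples, ordered as in `classes`.
classes = ["benign", "malicious"]
rf_probs = [[0.9, 0.1], [0.4, 0.6]]
xgb_probs = [[0.6, 0.4], [0.45, 0.55]]
knn_probs = [[0.2, 0.8], [0.7, 0.3]]
soft_result = soft_vote([rf_probs, xgb_probs, knn_probs], classes)

print(hard_result)
print(soft_result)
```

In the second soft-voting sample, two of the three models lean towards "malicious", yet the averaged probabilities favour "benign" because the third model is much more confident. This is the practical difference between the two schemes. Libraries such as scikit-learn expose both modes through the `voting` parameter of `VotingClassifier`.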

Paper Structure

This paper contains 24 sections, 4 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Feature importance of binary classification
  • Figure 2: Feature importance of multi-class classification
  • Figure 3: Flow Chart Representation of Our Proposed Hybrid Model Algorithm for Binary Classification
  • Figure 4: Flow Chart Representation of Our Proposed Hybrid Model Algorithm for Multi-Class Classification
  • Figure 5: Confusion Matrix of Binary Classification
  • ...and 3 more figures