Feature Selection for Network Intrusion Detection

Charles Westphal, Stephen Hailes, Mirco Musolesi

TL;DR

This work presents Feature Selection for Network Intrusion Detection (FSNID), a novel information-theoretic method that facilitates the exclusion of non-informative features when detecting network intrusions.

Abstract

Network Intrusion Detection (NID) remains a key area of research within the information security community, while also being relevant to Machine Learning (ML) practitioners. The latter generally aim to detect attacks using network features that have been extracted from raw network data, typically using dimensionality reduction methods such as principal component analysis (PCA). However, PCA is not able to assess the relevance of features for the task at hand. Consequently, the features available are of varying quality, with some being entirely non-informative. Two major drawbacks arise from this. Firstly, trained and deployed models have to process large amounts of unnecessary data, thereby draining potentially costly resources. Secondly, the noise caused by the presence of irrelevant features can, in some cases, impede a model's ability to detect an attack. To deal with these challenges, we present Feature Selection for Network Intrusion Detection (FSNID), a novel information-theoretic method that facilitates the exclusion of non-informative features when detecting network intrusions. The proposed method is based on function approximation using a neural network, which enables a version of our approach that incorporates a recurrent layer. This version uniquely enables the integration of temporal dependencies. Through an extensive set of experiments, we demonstrate that the proposed method selects a significantly reduced feature set while maintaining NID performance. Code will be made available upon publication.
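To make the general idea of information-theoretic feature filtering concrete, the following is a minimal sketch and not the FSNID method itself. According to the abstract, FSNID relies on neural network function approximation; the sketch instead assumes discrete features and uses a simple plug-in estimate of mutual information. The names `mutual_information` and `select_features`, the threshold of 0.01 nats, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # joint counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_features(columns, labels, threshold=0.01):
    """Keep features whose estimated MI with the labels exceeds the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        name
        for name, col in columns.items()
        if mutual_information(col, labels) > threshold
    ]


# Synthetic data: one feature that mostly tracks the label, one pure noise.
random.seed(0)
n = 2000
labels = [random.randint(0, 1) for _ in range(n)]
columns = {
    "informative": [y if random.random() < 0.9 else 1 - y for y in labels],
    "noise": [random.randint(0, 3) for _ in range(n)],
}

selected = select_features(columns, labels)
print(selected)
```

A per-feature filter of this kind scores each feature in isolation, so it cannot account for redundancy among correlated features or for temporal dependencies, both of which the abstract and figure captions identify as concerns the proposed method addresses.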

Paper Structure

This paper contains 42 sections, 21 equations, 6 figures, 7 tables, 1 algorithm.

Figures (6)

  • Figure 1: Diagrammatic representation of FSNID.
  • Figure 2: Comparison of the vanilla (red bars) and LSTM-based (pink bars) versions of FSNID to PI (green bars), UMFI (blue bars), CLM (brown bars), MIFA (purple bars) and LASSO (black bars). The yellow bar corresponds to a randomly selected set of features equal in size to the set of features selected using our vanilla method. From top to bottom, we present the proportion of features that were retained using each method, the accuracy achieved during the classification task using that set, the false positive rate, and the F1 score.
  • Figure 3: Performance comparison of FSNID against the chosen baselines in terms of their ability to deal with highly correlated features. Specifically, we plot the average MI shared between the top three features for each method with respect to each dataset.
  • Figure 4: Temporal complexity of FSNID and comparators with respect to the number of features.
  • Figure 5: Comparison of three neural architectures' ability to incorporate temporal dependencies into the feature selection and subsequent classification tasks.
  • ...and 1 more figure