Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models

Kacper Sowka, Vasile Palade, Xiaorui Jiang, Hesam Jadidbonab

TL;DR

This paper investigates the use of a pre-trained language model and a siamese network to discern sibling relationships between text descriptions of cybersecurity vulnerabilities, and outlines how to apply this capability to the generation of hierarchical attack models.

Abstract

This paper investigates the use of a pre-trained language model and a siamese network to discern sibling relationships between text descriptions of cybersecurity vulnerabilities. The ultimate purpose of the approach is the construction of hierarchical attack models from a set of text descriptions characterising potential or observed vulnerabilities in a given system. Due to the nature of the data, and the uncertainty-sensitive environment in which the problem is presented, a practically oriented soft computing approach is necessary. A key focus of this work is therefore the reliability of predicted links when constructing such models; to this end, the conceptual and practical challenges of the proposed approach, such as dataset complexity and stability of predictions, are outlined together with solutions. Accordingly, the contributions of this paper are neural networks built on a pre-trained language model for predicting sibling relationships between cybersecurity vulnerabilities, and an outline of how to apply this capability to the generation of hierarchical attack models. In addition, two data sampling mechanisms for tackling data complexity and a consensus mechanism for reducing the number of false positive predictions are outlined. These approaches are compared using empirical results from three sets of cybersecurity data to determine their effectiveness.
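The abstract does not spell out how the consensus mechanism works. As a minimal sketch of one plausible reading, the snippet below keeps a predicted sibling link only when a minimum number of independently trained models score it above a threshold. The function name `consensus_links`, the `threshold` and `min_votes` parameters, and the placeholder identifiers such as `CVE-A` are all assumptions made for illustration, not details taken from the paper.

```python
from typing import Dict, List, Optional, Set, Tuple

Pair = Tuple[str, str]


def consensus_links(
    model_scores: List[Dict[Pair, float]],
    threshold: float = 0.5,
    min_votes: Optional[int] = None,
) -> Set[Pair]:
    """Keep a candidate link only if enough models score it at or above threshold.

    Hypothetical helper: the voting rule here is an assumption, not the
    paper's stated mechanism.
    """
    if min_votes is None:
        min_votes = len(model_scores)  # default: require unanimous agreement
    votes: Dict[Pair, int] = {}
    for scores in model_scores:
        for pair, score in scores.items():
            # Sibling links are treated as undirected, so (a, b) == (b, a).
            key = tuple(sorted(pair))
            if score >= threshold:
                votes[key] = votes.get(key, 0) + 1
    return {pair for pair, count in votes.items() if count >= min_votes}


# Made-up scores from three models over placeholder vulnerability IDs.
scores_a = {("CVE-A", "CVE-B"): 0.9, ("CVE-A", "CVE-C"): 0.7}
scores_b = {("CVE-A", "CVE-B"): 0.8, ("CVE-A", "CVE-C"): 0.3}
scores_c = {("CVE-B", "CVE-A"): 0.95, ("CVE-A", "CVE-C"): 0.6}

unanimous = consensus_links([scores_a, scores_b, scores_c])
majority = consensus_links([scores_a, scores_b, scores_c], min_votes=2)
```

Under this rule, raising `min_votes` trades recall for fewer false positives: the link that one model rejects survives a majority vote but not a unanimous one.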

Paper Structure

This paper contains 31 sections, 9 equations, 6 figures, 8 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Example attack tree. AND nodes are marked by a line crossing the edges between the parent and its children; OR nodes use plain straight edges.
  • Figure 2: Visualisation of how the prediction network is designed. Encoder and predictor sections have separate hyperparameters, such as the number of layers (m) and the number of neurons (n) in each layer.
  • Figure 3: An illustration of the difference between positive and negative links between CVEs.
  • Figure 4: Clique-based sampling of negative links. Red arrows show where negative links are formed; an arrow crossed by a red "X" marks a link that is NOT formed. The dotted line marks the boundary between two "cliques".
  • Figure 5: Weighted random sampling of negative links. Greyed-out CVEs with dashed lines are being skipped by the sampling process.
  • ...and 1 more figure
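The AND/OR node semantics described in the Figure 1 caption can be made concrete with a small sketch. It uses the standard attack-tree reading: an AND node is achieved when all of its children are achieved, and an OR node when at least one child is. The `Node` class, the `achievable` function, and the example tree are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    gate: str = "LEAF"  # one of "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)


def achievable(node: Node, exploited: Set[str]) -> bool:
    """Return True if the goal at `node` is reachable given exploited leaves."""
    if node.gate == "LEAF":
        return node.name in exploited
    results = [achievable(child, exploited) for child in node.children]
    if node.gate == "AND":
        return all(results)  # every sub-goal is required
    if node.gate == "OR":
        return any(results)  # any single sub-goal suffices
    raise ValueError(f"unknown gate: {node.gate}")


# Illustrative tree: the root is an OR over an AND branch and a single leaf.
tree = Node("compromise server", "OR", [
    Node("steal credentials", "AND", [
        Node("phish admin"),
        Node("bypass 2FA"),
    ]),
    Node("exploit unpatched service"),
])

partial = achievable(tree, {"phish admin"})
both = achievable(tree, {"phish admin", "bypass 2FA"})
direct = achievable(tree, {"exploit unpatched service"})
```

Here `partial` is false because the AND branch needs both leaves, while `both` and `direct` are true because either branch satisfies the OR root.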