PLLM-CS: Pre-trained Large Language Model (LLM) for Cyber Threat Detection in Satellite Networks

Mohammed Hassanin; Marwa Keshk; Sara Salim; Majid Alsubaie; Dharmendra Sharma

TL;DR

PLLM-CS is an intrusion detection system for satellite networks built on a pre-trained Large Language Model. It reinterprets multivariate cyber data as token-like inputs and applies transformer encoders with a binary classification head. The method combines sentencing-based pre-processing, patch embedding, and self-attention to capture long-range dependencies, and is optimized with a binary cross-entropy loss. On UNSW-NB15 and TON_IoT, PLLM-CS outperforms traditional ML and DL baselines, reaching 100% accuracy on UNSW-NB15 and strong performance on TON_IoT, which supports it as a robust approach for SSN defenses. The work highlights the practical value of transformer-based contextual representation for onboard cyber defense and motivates the development of SSN-specific datasets and energy-efficient implementations.
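The pipeline summarized above (patching a record into token-like units, self-attention, a binary head, and binary cross-entropy) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the patch size, the identity Q/K/V projections, the mean pooling, the head weights, and the feature values are all assumptions chosen to keep the example small.

```python
import math

def patchify(features, patch_size):
    """Split a flat feature vector into fixed-size patches (token-like units).
    The last patch is zero-padded when the length is not a multiple of patch_size."""
    padded = list(features)
    remainder = len(padded) % patch_size
    if remainder:
        padded += [0.0] * (patch_size - remainder)
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.
    Assumption: Q = K = V = tokens (no learned projections), for illustration only."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def classify(tokens, head_weights, bias=0.0):
    """Mean-pool the tokens, then apply a logistic (binary) classification head."""
    d = len(tokens[0])
    pooled = [sum(t[j] for t in tokens) / len(tokens) for j in range(d)]
    logit = sum(w * x for w, x in zip(head_weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def bce_loss(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction p in (0, 1) and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

# Hypothetical, already-normalized flow features for one network record.
record = [0.2, 0.9, 0.1, 0.4, 0.7, 0.3, 0.5]
tokens = patchify(record, patch_size=2)        # 4 patches of width 2
context = self_attention(tokens)               # each patch attends to all patches
prob_attack = classify(context, head_weights=[0.5, -0.25])  # untrained weights
loss = bce_loss(prob_attack, y=1)              # label 1 = attack (assumed encoding)
```

In a real model, the patches would pass through a learned embedding, several multi-head encoder layers, and a trained head; the sketch only shows how the pieces connect.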

Abstract

Satellite networks are vital in facilitating communication services for various critical infrastructures, and they can integrate seamlessly with a diverse array of systems. However, some of these systems are vulnerable because they lack effective intrusion detection systems, a gap attributable to limited research and the high costs of deploying, fine-tuning, monitoring, and responding to security breaches. To address these challenges, we propose a pre-trained Large Language Model for Cyber Security (PLLM-CS), a variant of pre-trained Transformers [1] that includes a specialized module for transforming network data into contextually suitable inputs. This transformation enables the proposed LLM to encode contextual information within the cyber data. To validate the efficacy of the proposed method, we conducted empirical experiments on two publicly available network datasets, UNSW-NB15 and TON_IoT, both of which provide Internet of Things (IoT)-based traffic data. Our experiments demonstrate that the proposed LLM method outperforms state-of-the-art techniques such as BiLSTM, GRU, and CNN. Notably, PLLM-CS achieves an accuracy of 100% on the UNSW-NB15 dataset, setting a new standard for benchmark performance in this domain.

Paper Structure (13 sections, 12 equations, 6 figures, 4 tables)

Introduction
Related Works
Method
Pre-Processing
Pre-Trained Large Language Transformer Model
Classification head
Loss function
Experiments
Experimental Setup
Datasets
Baselines
Evaluation protocols
Conclusion and future work

Figures (6)

Figure 1: Overview of the proposed method, PLLM-CS, deployed as an IDS within the satellite network. PLLM-CS would run on an embedded AI-processing unit, such as an Nvidia Jetson, plugged into the satellite.
Figure 2: Visual explanation of the PLLM-CS components. First, the input is forwarded to the sentencing step to facilitate context encoding by multi-head attention (MHA).
Figure 3: Accuracy comparison of PLLM-CS, CNN, and BiLSTM in the training (left) and testing (right) phases on the UNSW-NB15 dataset. The proposed method achieves higher accuracy than the baselines.
Figure 4: Loss comparison of PLLM-CS, CNN, and BiLSTM in the training (left) and testing (right) phases on the UNSW-NB15 dataset. The proposed method shows stable convergence.
Figure 5: Accuracy comparison of PLLM-CS, GRU, and LSTM in the training (left) and testing (right) phases on the TON_IoT dataset. The proposed method achieves higher accuracy than the baselines.
...and 1 more figure
