Improving Network Threat Detection by Knowledge Graph, Large Language Model, and Imbalanced Learning

Lili Zhang, Quanyan Zhu, Herman Ray, Ying Xie

TL;DR

This paper tackles agile network threat detection under limited historical threat data by integrating knowledge graphs, imbalanced learning, and large language models within a multi-agent framework. It combines a dynamic knowledge graph to model user activity, a weighted imbalanced classifier to detect threats, and an LLM-driven retriever/interpreter to explain risk, all operating in online sequential mode. The approach yields a 3–4% improvement in threat capture and enhanced interpretability of risk explanations, demonstrated on the CERT Insider Threat dataset with an end-to-end demo. The work advances practical threat detection by uniting structured relational modeling, rare-event handling, and natural-language interpretation for real-time security monitoring in dynamic environments.

Abstract

Network threat detection is challenging because attack activities are complex and the historical threat data available to learn from are limited. To enhance existing practices that use analytics, machine learning, and artificial intelligence to detect network threats, we propose an integrated modelling framework in which a Knowledge Graph is used to analyze users' activity patterns, Imbalanced Learning techniques are used to prune and weight the Knowledge Graph, and an LLM is used to retrieve and interpret users' activities from the Knowledge Graph. The proposed framework is applied to Agile Threat Detection through Online Sequential Learning. Preliminary results show a 3%-4% improvement in the threat capture rate and increased interpretability of risk predictions based on users' activities.
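To make the Knowledge Graph component more concrete, the following is a minimal sketch of how per-user activity could be stored as graph edges and compared across time windows. The event tuples, the `(user, action, target)` edge layout, and the Jaccard-distance `change_score` are all illustrative assumptions; the abstract does not specify the paper's schema or its change-score definition.

```python
from collections import defaultdict

# Hypothetical activity events: (day, user, action, target).
# These records are made up for illustration, not taken from the CERT dataset.
EVENTS = [
    (1, "u1", "logon", "pc-01"),
    (1, "u1", "email", "alice@example.com"),
    (2, "u1", "logon", "pc-01"),
    (2, "u1", "usb_connect", "pc-07"),
    (2, "u1", "http", "fileshare.example.com"),
]


def build_daily_graphs(events):
    """Group events into one set of (user, action, target) edges per (user, day)."""
    graphs = defaultdict(set)
    for day, user, action, target in events:
        graphs[(user, day)].add((user, action, target))
    return graphs


def change_score(prev_edges, curr_edges):
    """Jaccard distance between two edge sets: 0 = identical, 1 = disjoint.

    This is an assumed stand-in for a change score, not the paper's formula.
    """
    union = prev_edges | curr_edges
    if not union:
        return 0.0
    return 1.0 - len(prev_edges & curr_edges) / len(union)


graphs = build_daily_graphs(EVENTS)
score = change_score(graphs[("u1", 1)], graphs[("u1", 2)])
print(round(score, 2))  # prints 0.75 (1 shared edge out of 4 distinct edges)
```

A score near 1 flags a day on which a user's activity graph shares little with the previous day's, which is the kind of signal a downstream classifier or an LLM-based interpreter could pick up.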

Paper Structure

This paper contains 26 sections, 3 theorems, 10 equations, 8 figures, 1 table.

Key Result

Theorem 1

Under standard regularity conditions (bounded features, model identifiable), the minimizer $\hat{\theta}_n$ of the empirical weighted loss is consistent: $\hat{\theta}_n \stackrel{p}{\to}\theta^*$ as $n\to\infty$. Moreover, with probability at least $1-\delta$, the uniform deviation assuming $x$ has bounded norm. Consequently, $\hat{\theta}_n$ has generalization error converging at rate $O(1/\sqr
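The consistency claim can be illustrated numerically. The sketch below assumes the weighted loss is a class-weighted logistic loss (the excerpt does not spell out the loss), simulates bounded features with rare positives, and fits the minimizer by Newton's method for increasing $n$. The names `simulate`, `fit_weighted_logistic`, the true parameters, and the weight `W_POS` are all hypothetical choices. For a well-specified logistic model, weighting positives by $w$ shifts the population minimizer's intercept by $\ln w$ and leaves the slope unchanged, which gives the target $\theta^*$ used here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate(n, b0, b1, rng):
    """Draw n pairs (x, y) with bounded x in [-2, 2] and P(y=1|x) = sigmoid(b0 + b1*x)."""
    data = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        y = 1 if rng.random() < sigmoid(b0 + b1 * x) else 0
        data.append((x, y))
    return data


def fit_weighted_logistic(data, w_pos, iters=20):
    """Minimize the class-weighted logistic loss over (intercept, slope) by Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            w = w_pos if y == 1 else 1.0
            p = sigmoid(a + b * x)
            r = w * (p - y)          # weighted residual (gradient term)
            c = w * p * (1.0 - p)    # weighted curvature (Hessian term)
            g0 += r
            g1 += r * x
            h00 += c
            h01 += c * x
            h11 += c * x * x
        det = h00 * h11 - h01 * h01
        a -= (h11 * g0 - h01 * g1) / det
        b -= (h00 * g1 - h01 * g0) / det
    return a, b


B0, B1, W_POS = -4.0, 2.0, 5.0           # assumed true parameters and positive-class weight
target = (B0 + math.log(W_POS), B1)      # population minimizer of the weighted loss

rng = random.Random(0)
errors = []
for n in (500, 5000, 20000):
    a_hat, b_hat = fit_weighted_logistic(simulate(n, B0, B1, rng), W_POS)
    err = max(abs(a_hat - target[0]), abs(b_hat - target[1]))
    errors.append(err)
    print(n, round(a_hat, 3), round(b_hat, 3), round(err, 3))
```

With a single simulated sample per $n$ the error need not shrink monotonically, but it should be small at the largest sample size; averaging over repeated draws would give a cleaner picture of the rate.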

Figures (8)

  • Figure 1: LLM Question-Answer Process
  • Figure 2: Multi-agent AI Framework of Network Threat Detection
  • Figure 3: LLM-based Knowledge Graph Retriever and Interpreter
  • Figure 4: User-Activity Knowledge Graph Schema
  • Figure 5: User CSC0217 Activity Graph and Change Score
  • ...and 3 more figures

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Theorem 3
  • proof