Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models

Jiandong Jin; Bowen Tang; Mingxuan Ma; Xiao Liu; Yunfei Wang; Qingnan Lai; Jia Yang; Changling Zhou

TL;DR

Crimson tackles the challenge of turning unstructured vulnerability data into structured, strategic cybersecurity insights by mapping CVEs to MITRE ATT&CK techniques. The framework combines a CVE↔ATT&CK mapping dataset (CVEM), Retrieval-Aware Training (RAT) and RAT-R, and domain-specific embeddings to boost LLM-driven strategic reasoning. A 7B-parameter model fine-tuned with Crimson approaches GPT-4 performance while exhibiting fewer hallucinations and errors, and domain-tuned embeddings significantly improve technique discrimination. The work demonstrates that retrieval-aware training and targeted embedding fine-tuning can yield high-quality, interpretable CVE→ATT&CK mappings, enabling proactive defense with smaller, more efficient models.

Abstract

We introduce Crimson, a system that enhances the strategic reasoning capabilities of Large Language Models (LLMs) in cybersecurity. By correlating CVEs with MITRE ATT&CK techniques, Crimson advances threat anticipation and strategic defense efforts. Our approach includes defining and evaluating cybersecurity strategic tasks and implementing a comprehensive human-in-the-loop data-synthesis workflow to develop the CVE-to-ATT&CK Mapping (CVEM) dataset. We further enhance LLMs' reasoning abilities through a novel Retrieval-Aware Training (RAT) process and its refined iteration, RAT-R. Our findings demonstrate that a 7-billion-parameter LLM fine-tuned with our techniques approaches the performance of GPT-4, shows markedly lower rates of hallucination and errors, and surpasses other models in strategic reasoning tasks. Moreover, domain-specific fine-tuning of embedding models significantly improves performance within cybersecurity contexts, underscoring the efficacy of our methodology. By leveraging Crimson to convert raw vulnerability data into structured and actionable insights, we bolster proactive cybersecurity defenses.

Paper Structure (29 sections, 7 figures, 3 tables, 1 algorithm)

Introduction
Integrating CVE/CTI with ATT&CK.
Advancements in LLM Reasoning.
Contributions.
Background
Vulnerability Artifacts
MITRE ATT&CK Framework
Methodology
Strategic Reasoning
Dataset Collection
Crimson
Retrieval-Aware Training (RAT) and RAT-R.
Domain-specific Embedding Model.
Strategy Measurement
F1-Score and IOU.
...and 14 more sections
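
The outline above lists F1-score and IoU under strategy measurement. As a minimal sketch, assuming both metrics are computed over sets of ATT&CK technique IDs (predicted versus expert-assigned) for a single CVE, they could look like the following. The function name and the example sets are hypothetical and are not taken from the paper; the technique IDs are borrowed from the Figure 5 caption purely for illustration and do not represent a real CVE mapping.

```python
def f1_and_iou(predicted, reference):
    """Return (F1, IoU) for two collections of ATT&CK technique IDs.

    Assumption: both metrics are set-based and computed per CVE.
    This is an illustrative helper, not the paper's evaluation code.
    """
    pred, ref = set(predicted), set(reference)
    overlap = len(pred & ref)   # techniques present in both sets
    union = len(pred | ref)     # techniques present in either set

    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    iou = overlap / union if union else 0.0
    return f1, iou


# Hypothetical prediction and reference sets for one CVE.
predicted = {"T1189", "T1190", "T1003"}
reference = {"T1190", "T1003", "T1040"}
f1, iou = f1_and_iou(predicted, reference)
print(f"F1={f1:.3f} IoU={iou:.3f}")
```

IoU penalizes every extra or missing technique through the union, whereas F1 balances precision against recall, so reporting both gives a fuller picture of how closely a predicted technique set matches the reference.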

Figures (7)

Figure 1: A schematic overview of the three-phase methodology employed in our research: (1) Synthetic data curation, which involves collating CVEs and CTIs from diverse sources, aligning them for LLM-generated strategic proposals, and adjudicating those proposals with human expert feedback; (2) Retrieval-aware training, where we sample human-assessed data sets, retrieve relevant ATT&CK information from a knowledge base, develop reasoning schemas, and refine explanation outputs; (3) Strategic reasoning, in which LLMs apply these schemas to deduce ATT&CK techniques from CVEs and CTIs, aiding in the anticipation of cyber threats. Further details on the process are provided in Section 3 of our paper.
Figure 2: CVE-2020-0601 (ChainOfFools), with its related techniques and impacts.
Figure 3: CVE Mapping (CVEM) schema
Figure 4: Prompt template for RAT. Text in red indicates system-generated prompts, text in green represents user input, and text in blue signifies content retrieved by the system.
Figure 5: t-SNE Visualization of Cybersecurity Techniques with Contextual Differentiation. This figure shows the embeddings for two technique pairs: T1189 and T1190, highlighted with red rectangles, and T1003 and T1040, highlighted with green rectangles. Rectangles without borders depict embeddings before contextual differentiation, while those with solid borders show the embeddings after domain-specific modeling adjustments.
...and 2 more figures
