Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation

Arnabh Borah, Md Tanvirul Alam, Nidhi Rastogi

TL;DR

Problem: LLMs struggle to adapt to evolving cyber threats and require reliable temporal reasoning. Approach: a retrieval-augmented generation framework combining dense and sparse retrieval plus CVE regex to ground LLMs in up-to-date CTI sources, evaluated with Llama-3-8B-Instruct on SECURE-KCV and CWET. Contributions: a novel hybrid sparse–dense retriever with regex augmentation, plus ablation analyses of temperature and embedding models and discussion of contextual-grounding strategies. Findings: the hybrid approach yields substantial accuracy gains over baselines (e.g., 62.5% KCV, 92.2% CWET; 72.7% with CVE regex on KCV) and demonstrates the importance of data quality and domain-aware retrieval. Significance: provides practical guidelines for deploying RAG in security-critical workflows with robust, up-to-date, and trustworthy cyber threat reasoning.

Abstract

Security applications increasingly rely on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often limits trust, particularly in decisions that require domain-specific cybersecurity knowledge. Because security threats evolve rapidly, LLMs must not only recall historical incidents but also adapt to emerging vulnerabilities and attack patterns. Retrieval-Augmented Generation (RAG) has demonstrated effectiveness in general LLM applications, but its potential for cybersecurity remains underexplored. In this work, we introduce a RAG-based framework designed to contextualize cybersecurity data and enhance LLM accuracy in knowledge retention and temporal reasoning. Using external datasets and the Llama-3-8B-Instruct model, we evaluate a baseline RAG pipeline and an optimized hybrid retrieval approach, and compare them across multiple performance metrics. Our findings highlight the promise of hybrid retrieval in strengthening the adaptability and reliability of LLMs for cybersecurity tasks.
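
To make the idea of hybrid retrieval with CVE regex augmentation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the summary above does not say how sparse scores, dense scores, and regex matches are combined, so the weighted sum (`alpha`) and the additive `cve_boost` are hypothetical choices. The sparse scorer is a simple TF-IDF-style stand-in for a retriever such as BM25, and `embed` is a toy hashed-trigram vector standing in for a neural embedding model.

```python
import math
import re
import zlib
from collections import Counter

# CVE identifiers look like CVE-YYYY-NNNN, with four or more trailing digits.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_cve_ids(text):
    """Return the set of CVE identifiers mentioned in text, upper-cased."""
    return {match.upper() for match in CVE_PATTERN.findall(text)}


def sparse_scores(query, docs):
    """TF-IDF-style term overlap (assumed stand-in for a sparse retriever)."""
    doc_counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    query_terms = set(tokenize(query))
    scores = []
    for counts in doc_counts:
        score = 0.0
        for term in query_terms:
            df = sum(1 for c in doc_counts if term in c)
            if df:
                score += counts[term] * math.log(1 + n_docs / df)
        scores.append(score)
    return scores


def embed(text, dim=64):
    """Toy embedding: L2-normalized hashed character trigrams (assumption)."""
    vec = [0.0] * dim
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        vec[zlib.crc32(lowered[i:i + 3].encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dense_scores(query, docs):
    """Cosine similarity between the query vector and each document vector."""
    q = embed(query)
    return [sum(a * b for a, b in zip(q, embed(d))) for d in docs]


def min_max(scores):
    """Rescale scores to [0, 1] so sparse and dense values are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_rank(query, docs, alpha=0.5, cve_boost=1.0):
    """Rank document indices by fused score; boost exact CVE ID matches."""
    sparse = min_max(sparse_scores(query, docs))
    dense = min_max(dense_scores(query, docs))
    query_cves = extract_cve_ids(query)
    scored = []
    for i, doc in enumerate(docs):
        score = alpha * dense[i] + (1 - alpha) * sparse[i]
        if query_cves & extract_cve_ids(doc):
            score += cve_boost  # hypothetical boost for a regex hit
        scored.append((score, i))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored]


docs = [
    "CVE-2024-3094 is a backdoor in xz utils liblzma.",
    "Phishing campaigns often use malicious attachments.",
    "CVE-2021-44228 affects Apache Log4j and allows remote code execution.",
]
ranking = hybrid_rank("What does CVE-2021-44228 affect?", docs)
```

In this sketch the exact-identifier match acts as a strong signal on top of the fused score, which reflects the intuition that a query naming a specific CVE should surface documents about that CVE even when lexical or embedding similarity alone is ambiguous. In practice the top-ranked passages would be inserted into the LLM prompt as context.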

Paper Structure

This paper contains 20 sections, 2 equations, 2 figures, 3 tables, 2 algorithms.

Figures (2)

  • Figure 1: General overview of each major step of the RAG framework.
  • Figure 2: General prompt format in the KCV dataset for non-RAG evaluation.