CyberRAG: An Agentic RAG cyber attack classification and reporting tool

Francesco Blefari, Cristian Cosentino, Francesco Aurelio Pironti, Angelo Furfaro, Fabrizio Marozzo

TL;DR

CyberRAG tackles the overload and limited interpretability of IDS/IPS alerts by introducing a modular agent-based RAG framework for real-time cyber-attack classification, explanation, and reporting. It uses a central LLM to orchestrate fine-tuned classifiers for attack families and a multi-pass retrieval loop over a domain-specific knowledge base to justify predictions. The system achieves over 94% accuracy per class and 94.92% final accuracy, with explanations scoring 0.94 on BERTScore and 4.9/5 from a GPT-4-based expert judge, showing robustness to adversarial and unseen payloads. This work demonstrates that agentic, specialist-oriented RAG can deliver trustworthy, SOC-ready prose while maintaining high detection performance and adaptability for partially automated cyber defense.

Abstract

Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94% accuracy per class and a final classification accuracy of 94.92% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.
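
The control flow described in the abstract (per-family classifiers, then a retrieval loop that repeats until the evidence is relevant) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the keyword-based scorers stand in for the fine-tuned classifiers, the three-passage `KNOWLEDGE_BASE` and word-overlap `retrieve` stand in for the domain-specific knowledge base, and the names `analyze`, `Report`, `HINTS`, the 0.5 threshold, and the "unclassified" label are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical stand-ins for the fine-tuned, per-family classifiers.
# Each returns a confidence in [0, 1] that the payload belongs to its family.
def sqli_score(payload: str) -> float:
    p = payload.lower()
    return 0.95 if ("' or " in p or "union select" in p) else 0.05

def xss_score(payload: str) -> float:
    p = payload.lower()
    return 0.95 if ("<script" in p or "onerror=" in p) else 0.05

def ssti_score(payload: str) -> float:
    return 0.95 if ("{{" in payload and "}}" in payload) else 0.05

CLASSIFIERS: Dict[str, Callable[[str], float]] = {
    "SQLi": sqli_score, "XSS": xss_score, "SSTI": ssti_score,
}

# Assumed query-refinement hints, one per attack family.
HINTS: Dict[str, str] = {
    "SQLi": "sql injection",
    "XSS": "cross site scripting",
    "SSTI": "server side template injection",
}

# Toy knowledge base of (family, passage) pairs; purely illustrative.
KNOWLEDGE_BASE: List[Tuple[str, str]] = [
    ("SQLi", "SQL injection inserts attacker controlled SQL into a query, "
             "for example via UNION SELECT or tautologies."),
    ("XSS", "Cross site scripting injects script into pages viewed by other "
            "users, for example through script tags."),
    ("SSTI", "Server side template injection abuses template expressions "
             "such as double curly braces evaluated on the server."),
]

def tokens(text: str) -> Set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str) -> Tuple[str, str]:
    """Return the passage sharing the most words with the query."""
    q = tokens(query)
    return max(KNOWLEDGE_BASE, key=lambda doc: len(q & tokens(doc[1])))

@dataclass
class Report:
    label: str
    confidence: float
    evidence: Optional[str]
    passes: int  # number of retrieval passes that were needed

def analyze(payload: str, threshold: float = 0.5, max_passes: int = 3) -> Report:
    # Step 1: ask every specialist and keep the most confident family.
    scores = {family: clf(payload) for family, clf in CLASSIFIERS.items()}
    label = max(scores, key=lambda family: scores[family])
    confidence = scores[label]
    if confidence < threshold:
        return Report("unclassified", confidence, None, 0)
    # Step 2: retrieve, check relevance, and refine the query if needed.
    query = payload
    for n in range(1, max_passes + 1):
        family, passage = retrieve(query)
        if family == label:  # evidence agrees with the predicted label
            return Report(label, confidence, passage, n)
        query = f"{payload} {HINTS[label]}"  # refine and try again
    return Report(label, confidence, None, max_passes)
```

In this sketch, calling `analyze("{{7*7}}")` finds no usable evidence on the first pass because the raw payload contains no words, so the query is refined with the family hint and the matching passage is returned on the second pass. Supporting a new attack family would mean adding one entry to `CLASSIFIERS`, `HINTS`, and `KNOWLEDGE_BASE` while leaving `analyze` untouched, which mirrors the extensibility claim only loosely.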

Paper Structure

This paper contains 36 sections, 2 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Structure of a multimodal AI Agent: integrates perception, memory, and planning to interact with the environment and perform intelligent actions.
  • Figure 2: CyberRAG system architecture: the user interacts with a chatbot connected to the webserver, while the IDS detects attacks from the Internet.
  • Figure 3: Classification performance of BERT on different web vulnerabilities using attack-specific training.
  • Figure 4: Comparison between explanations generated with and without retrieval. LLM-based scoring shows consistent advantage from RAG-enhanced generation.
  • Figure 5: Evaluation of model robustness based on the percentage of correct classifications under two conditions: (i) Adversarial Examples, where inputs are perturbed to simulate evasion attacks, and (ii) Out-of-Distribution (OOD) Inputs, representing unseen attack categories. The metric Correct Classification (%) reflects the number of accurate predictions out of 100 test cases for each scenario.