ProveRAG: Provenance-Driven Vulnerability Analysis with Automated Retrieval-Augmented LLMs
Reza Fayyazi, Stella Hoyos Trueba, Michael Zuzak, Shanchieh Jay Yang
TL;DR
ProveRAG addresses the challenge of real-time vulnerability analysis under rapid CVE growth by coupling retrieval-augmented generation with a self-critique module and provenance tracking. It grounds responses in verifiable sources (NVD, CWE, Aqua) to reduce hallucinations and improve trustworthiness. A summarizing retrieval strategy outperforms chunking, enabling efficient, context-aware guidance for exploitation and mitigation. Across 482 critical 2024 CVEs and multiple LLMs (GPT-4o-mini and Llama-3.1-8B), ProveRAG achieves high exploitation and mitigation accuracy, with Aqua further boosting performance and provenance quality, demonstrating strong generalizability and auditability for security teams.
Abstract
In cybersecurity, analysts constantly face the challenge of mitigating newly discovered vulnerabilities in real time, with over 300,000 vulnerabilities identified since 1999. The sheer volume of known vulnerabilities complicates the detection of patterns for unknown threats. While LLMs can assist, they often hallucinate and lack alignment with recent threats. Over 40,000 vulnerabilities were identified in 2024 alone, which were disclosed after the training data cutoff of most popular LLMs (e.g., GPT-4o). This poses a major challenge for leveraging LLMs in cybersecurity, where accuracy and up-to-date information are paramount. We therefore aim to improve the adaptation of LLMs to vulnerability analysis by mimicking how an analyst performs such tasks. We propose ProveRAG, an LLM-powered system designed to assist in rapidly analyzing vulnerabilities through automated retrieval augmentation of web data, while self-evaluating its responses against verifiable evidence. ProveRAG incorporates a self-critique mechanism to help alleviate the omissions and hallucinations common in the output of LLMs used in cybersecurity applications. The system cross-references data from verifiable sources (NVD and CWE), giving analysts confidence in the actionable insights provided. Our results indicate that ProveRAG excels in delivering verifiable evidence to the user, with over 99% and 97% accuracy in exploitation and mitigation strategies, respectively. ProveRAG guides analysts to secure their systems more effectively by overcoming temporal and context-window limitations, while also documenting the process for future audits.
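The retrieve, self-critique, and provenance idea described above can be illustrated with a minimal sketch. This is not the authors' implementation. The corpus, the CVE identifier, the `Evidence` and `Verdict` types, and the token-overlap check are all hypothetical stand-ins: in ProveRAG the critique is performed by an LLM against retrieved web content, whereas here a simple word-overlap score plays that role and the generation step is omitted, so claims are passed in directly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Evidence:
    source: str  # hypothetical source label, e.g. "NVD" or "CWE"
    text: str    # summarized content retrieved for the CVE


@dataclass
class Verdict:
    claim: str
    label: str             # "supported" or "unsupported"
    source: Optional[str]  # provenance: which source backs the claim


# Toy in-memory corpus standing in for summarized web retrieval.
# The CVE ID and texts are placeholders, not real advisory content.
TOY_CORPUS = {
    "CVE-0000-0001": [
        Evidence("NVD", "A crafted request lets a remote attacker execute code. "
                        "Upgrade to the patched version."),
        Evidence("CWE", "Improper input validation allows injected data to "
                        "alter control flow."),
    ]
}


def retrieve(cve_id: str) -> List[Evidence]:
    """Return the evidence stored for a CVE (empty list if unknown)."""
    return TOY_CORPUS.get(cve_id, [])


def _tokens(text: str) -> set:
    """Lowercased words longer than three characters, punctuation stripped."""
    words = (w.strip(".,").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def critique(claims: List[str], evidence: List[Evidence],
             threshold: float = 0.5) -> List[Verdict]:
    """Label each claim by its best word overlap with any evidence item.

    Assumption: overlap is a crude proxy for the LLM-based self-critique.
    """
    verdicts = []
    for claim in claims:
        claim_tokens = _tokens(claim)
        best_source, best_score = None, 0.0
        for ev in evidence:
            overlap = len(claim_tokens & _tokens(ev.text))
            score = overlap / len(claim_tokens) if claim_tokens else 0.0
            if score > best_score:
                best_source, best_score = ev.source, score
        if best_score >= threshold:
            verdicts.append(Verdict(claim, "supported", best_source))
        else:
            verdicts.append(Verdict(claim, "unsupported", None))
    return verdicts


if __name__ == "__main__":
    claims = [
        "A remote attacker can execute code with a crafted request.",
        "The flaw requires physical access to the device.",
    ]
    for v in critique(claims, retrieve("CVE-0000-0001")):
        print(v.label, v.source, "|", v.claim)
```

Each verdict carries the source that backs it, so an analyst can audit which statements rest on retrieved evidence and which do not. A real system would replace the overlap score with a model-based judgment and the toy corpus with live retrieval.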
