ChatNVD: Advancing Cybersecurity Vulnerability Assessment with Large Language Models

Shivansh Chopra, Hussain Ahmad, Diksha Goel, Claudia Szabo

TL;DR

ChatNVD is an LLM-driven vulnerability assessment tool that uses NVD data to generate contextual, accessible vulnerability analyses. The study compares GPT-4o Mini, LLaMA 3, and Gemini 1.5 Pro using a TF-IDF embedding pipeline and an evaluation on CVE-based queries, and finds that GPT-4o Mini achieves the highest accuracy and remains robust across input sizes and question types (accuracy $>0.92$). The work demonstrates practical deployment through a FastAPI–AWS–React stack and provides a framework for domain-specific evaluation and prompt design in cybersecurity tasks. It also discusses the reliability, cost, and scalability considerations that matter for real-world adoption, and outlines future research directions in LLM-based vulnerability assessment.
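
To make the TF-IDF step more concrete, here is a minimal sketch of how a CVE-based query could be matched against vulnerability descriptions before the retrieved text is handed to an LLM. It is an illustration under stated assumptions, not the authors' implementation: the record IDs and descriptions are hypothetical placeholders (not real NVD entries), the function names (`build_index`, `retrieve`, `build_prompt`) are invented for this example, and the smoothed IDF formula is one common choice rather than anything the summary specifies.

```python
import math
import re
from collections import Counter

# Hypothetical records: placeholder IDs and descriptions, not real NVD entries.
RECORDS = {
    "CVE-0000-0001": "Buffer overflow in the image parser allows remote code execution",
    "CVE-0000-0002": "SQL injection in the login form allows attackers to read the database",
    "CVE-0000-0003": "Cross-site scripting in the comment field allows script injection",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Return (idf, vectors): per-term IDF weights and one TF-IDF vector per record."""
    counts_by_id = {cid: Counter(tokenize(desc)) for cid, desc in records.items()}
    n_docs = len(counts_by_id)
    # Document frequency: number of records in which each term appears.
    doc_freq = Counter(term for counts in counts_by_id.values() for term in counts)
    # Smoothed IDF (an assumed variant); rarer terms receive larger weights.
    idf = {t: math.log((1 + n_docs) / (1 + f)) + 1.0 for t, f in doc_freq.items()}
    vectors = {
        cid: {t: c * idf[t] for t, c in counts.items()}
        for cid, counts in counts_by_id.items()
    }
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, idf, vectors, top_k=1):
    """Rank record IDs by similarity to the query and return the top_k IDs."""
    counts = Counter(tokenize(query))
    # Terms unseen at indexing time carry no weight and are dropped.
    query_vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    ranked = sorted(vectors, key=lambda cid: cosine(query_vec, vectors[cid]), reverse=True)
    return ranked[:top_k]


def build_prompt(query, records, cve_ids):
    """Assemble a hypothetical prompt: retrieved descriptions followed by the question."""
    context = "\n".join(f"{cid}: {records[cid]}" for cid in cve_ids)
    return f"Context:\n{context}\n\nQuestion: {query}"


IDF, VECTORS = build_index(RECORDS)
```

Calling `retrieve("Which vulnerability involves SQL injection in a login form?", IDF, VECTORS)` selects the second placeholder record, and `build_prompt` then wraps that description and the question into a single string that could be sent to whichever model is being evaluated.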

Abstract

The increasing frequency and sophistication of cybersecurity vulnerabilities in software systems underscore the need for more robust and effective vulnerability assessment methods. However, existing approaches often rely on highly technical and abstract frameworks, which hinder understanding and increase the likelihood of exploitation, resulting in severe cyberattacks. In this paper, we introduce ChatNVD, a support tool powered by Large Language Models (LLMs) that leverages the National Vulnerability Database (NVD) to generate accessible, context-rich summaries of software vulnerabilities. We develop three variants of ChatNVD, utilizing three prominent LLMs: GPT-4o Mini by OpenAI, LLaMA 3 by Meta, and Gemini 1.5 Pro by Google. To evaluate their performance, we conduct a comparative evaluation focused on their ability to identify, interpret, and explain software vulnerabilities. Our results demonstrate that GPT-4o Mini outperforms the other models, achieving over 92% accuracy and the lowest error rates, making it the most reliable option for real-world vulnerability assessment.

Paper Structure

This paper contains 32 sections, 16 figures, 3 tables.

Figures (16)

  • Figure 1: Growth Trend in Reported Software Vulnerabilities [guo2024outside].
  • Figure 2: High-Level View of Transformer Architecture Underlying Modern LLMs [ahmed2023chatgpt].
  • Figure 3: Proposed Architecture of ChatNVD
  • Figure 4: Phased Workflow of ChatNVD Research Methodology.
  • Figure 5: Prompt template for generating responses in Fig. \ref{fig:gemini response}.
  • ...and 11 more figures