Explaining Software Vulnerabilities with Large Language Models

Oshando Johnson, Alexandra Fomina, Ranjith Krishnamurthy, Vaibhav Chaudhari, Rohith Kumar Shanmuganathan, Eric Bodden

TL;DR

This work addresses the usability gap in static analysis by integrating an LLM-based explainer into an IDE. The resulting tool, SAFE, produces natural-language causes, impacts, and mitigations for SAST findings. The approach parses SARIF results, annotates them with taint-flow and CWE context, and uses a zero-shot GPT-4o prompt to generate explanations presented in a dedicated UI. Empirical evaluation shows that zero-shot prompts can effectively detect vulnerabilities across many models, while the explanations are generally faithful and helpful for beginner-to-intermediate developers, though with room for improvement in relevance and in handling false positives. Overall, SAFE demonstrates that a hybrid SAST-LLM pipeline can enhance vulnerability understanding and remediation in real-world software development tasks, motivating further user studies and prompt refinements.

Abstract

The prevalence of security vulnerabilities has prompted companies to adopt static application security testing (SAST) tools for vulnerability detection. Nevertheless, these tools frequently exhibit usability limitations, as their generic warning messages do not sufficiently communicate important information to developers, resulting in misunderstandings or oversight of critical findings. In light of recent developments in Large Language Models (LLMs) and their text generation capabilities, our work investigates a hybrid approach that uses LLMs to tackle the SAST explainability challenges. In this paper, we present SAFE, an Integrated Development Environment (IDE) plugin that leverages GPT-4o to explain the causes, impacts, and mitigation strategies of vulnerabilities detected by SAST tools. Our expert user study findings indicate that the explanations generated by SAFE can significantly assist beginner to intermediate developers in understanding and addressing security vulnerabilities, thereby improving the overall usability of SAST tools.
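The pipeline summarized above (parse SARIF results, attach taint-flow and CWE context, build a zero-shot prompt) can be illustrated with a minimal sketch. It is not SAFE's implementation: the `Finding` class, the helper names, the sample SARIF document, and the prompt wording are all hypothetical. The sketch also assumes that CWE identifiers appear as rule tags of the form `external/cwe/cwe-079`, which is a convention used by some tools rather than something SARIF mandates. The call to the language model is deliberately left out.

```python
import json
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Finding:
    """Hypothetical container for one SAST result plus its context."""
    rule_id: str
    message: str
    file: str
    line: int
    cwe: Optional[str] = None
    flow: List[str] = field(default_factory=list)  # "file:line" steps, source to sink


def _loc(physical: dict) -> tuple:
    """Return (uri, startLine) from a SARIF physicalLocation object."""
    uri = physical.get("artifactLocation", {}).get("uri", "?")
    line = physical.get("region", {}).get("startLine", 0)
    return uri, line


def _cwe_from_tags(tags: list) -> Optional[str]:
    # Assumption: CWE ids are encoded in rule tags such as "external/cwe/cwe-079".
    for tag in tags:
        match = re.search(r"cwe-(\d+)", tag, re.IGNORECASE)
        if match:
            return f"CWE-{int(match.group(1))}"
    return None


def parse_sarif(text: str) -> List[Finding]:
    """Extract findings, their CWE tag, and their taint flow from SARIF JSON."""
    findings = []
    for run in json.loads(text).get("runs", []):
        rules = run.get("tool", {}).get("driver", {}).get("rules", [])
        rules_by_id = {rule["id"]: rule for rule in rules}
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "")
            locations = result.get("locations") or [{}]
            uri, line = _loc(locations[0].get("physicalLocation", {}))
            tags = rules_by_id.get(rule_id, {}).get("properties", {}).get("tags", [])
            flow = []
            for code_flow in result.get("codeFlows", []):
                for thread_flow in code_flow.get("threadFlows", []):
                    for step in thread_flow.get("locations", []):
                        physical = step.get("location", {}).get("physicalLocation", {})
                        step_uri, step_line = _loc(physical)
                        flow.append(f"{step_uri}:{step_line}")
            message = result.get("message", {}).get("text", "")
            findings.append(
                Finding(rule_id, message, uri, line, _cwe_from_tags(tags), flow)
            )
    return findings


def build_prompt(finding: Finding) -> str:
    """Build a zero-shot prompt; the wording is illustrative, not SAFE's prompt."""
    flow = " -> ".join(finding.flow) if finding.flow else "not reported"
    return (
        "You are a security expert assisting a developer.\n"
        f"A static analysis tool reported rule '{finding.rule_id}' "
        f"({finding.cwe or 'CWE unknown'}) at {finding.file}:{finding.line}.\n"
        f"Tool message: {finding.message}\n"
        f"Taint flow: {flow}\n"
        "Explain the cause, the impact, and a mitigation for this finding."
    )


def _step(uri: str, line: int) -> dict:
    """Build one SARIF location object for the made-up sample below."""
    return {"physicalLocation": {"artifactLocation": {"uri": uri},
                                 "region": {"startLine": line}}}


# Made-up SARIF 2.1.0 document with one cross-site scripting result.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "demo-sast", "rules": [
            {"id": "java/xss",
             "properties": {"tags": ["security", "external/cwe/cwe-079"]}},
        ]}},
        "results": [{
            "ruleId": "java/xss",
            "message": {"text": "User input reaches an HTML response."},
            "locations": [_step("Greeting.java", 42)],
            "codeFlows": [{"threadFlows": [{"locations": [
                {"location": _step("Greeting.java", 30)},
                {"location": _step("Greeting.java", 42)},
            ]}]}],
        }],
    }],
})

if __name__ == "__main__":
    print(build_prompt(parse_sarif(SAMPLE)[0]))
```

The resulting string would then be sent to a chat model such as GPT-4o, and the reply shown in the IDE. Keeping prompt construction separate from the model call makes the SARIF handling testable without network access.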

Paper Structure

This paper contains 13 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Screenshot of SAFE's tool window showing the tree view and the tabbed pane, which contains tabs for result details, explanations, and data flow. The result details and the explanation for a sample cross-site scripting vulnerability are shown.
  • Figure 2: Architecture of the SAFE Integrated Development Environment plugin for explaining static analysis tool results with large language models.
  • Figure 3: Stacked bar chart showing the evaluation of the vulnerability explanations using a Likert scale with options from very poor to very good.