Large Language Model for Vulnerability Detection and Repair: Literature Review and the Road Ahead

Xin Zhou, Sicong Cao, Xiaobing Sun, David Lo

TL;DR

This paper surveys the application of large language models (LLMs) to software vulnerability detection and repair, synthesizing 58 primary studies (2018–2024) to map which LLMs are used, how they are adapted, and which data and workflow considerations matter. It identifies three mainstream adaptation techniques for detection (finetuning, prompting, retrieval augmentation) and two for repair (finetuning, prompting), detailing data-centric, model-centric, and domain-specific strategies within each. The authors highlight key limitations (data quality, input granularity, deployment integration, robustness, and reliance on lightweight models) and offer a multi-stage roadmap toward repo-, class-, and project-level vulnerability analysis, collaboration with developers, and deployment-ready tools. The work gives guidance on creating high-quality benchmarks, employing customized LLMs, and integrating LLM-based methods into developer workflows to improve real-world vulnerability detection and repair. Together, these results provide a foundation for further LLM-based vulnerability research and outline concrete directions for future work.

Abstract

The significant advancements in Large Language Models (LLMs) have resulted in their widespread adoption across various tasks within Software Engineering (SE), including vulnerability detection and repair. Numerous studies have investigated the application of LLMs to enhance vulnerability detection and repair tasks. Despite the increasing research interest, there is currently no survey that focuses on the use of LLMs for vulnerability detection and repair. In this paper, we aim to bridge this gap by offering a systematic literature review of approaches that improve vulnerability detection and repair through the use of LLMs. The review covers research from leading SE, AI, and Security conferences and journals: 43 papers published across 25 distinct venues, along with 15 high-quality preprint papers, for a total of 58 papers. By answering three key research questions, we aim to (1) summarize the LLMs employed in the relevant literature, (2) categorize various LLM adaptation techniques in vulnerability detection, and (3) classify various LLM adaptation techniques in vulnerability repair. Based on our findings, we have identified a series of limitations of existing studies. Additionally, we have outlined a roadmap highlighting potential opportunities that we believe are pertinent and crucial for future research endeavors.

Paper Structure

This paper contains 29 sections, 8 figures, and 3 tables.

Figures (8)

  • Figure 1: Structure of This Survey
  • Figure 2: Study Identification and Selection Process
  • Figure 3: Distribution of Publications per Year
  • Figure 4: Distribution of LLMs for Vulnerability Detection
  • Figure 5: Distribution of LLMs for Vulnerability Repair
  • ...and 3 more figures