Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

Aaron Chan; Anant Kharkar; Roshanak Zilouchian Moghaddam; Yevhen Mohylevskyy; Alec Helyar; Eslam Kamal; Mohamed Elkamhawy; Neel Sundaresan

TL;DR

The paper tackles the problem of detecting software vulnerabilities at EditTime, when code is incomplete or being written, by leveraging transformer-based models trained on a large, diverse vulnerability dataset. It systematically compares zero-shot, few-shot, and fine-tuning strategies across CodeBERT and two code-focused LLMs, and introduces a data pipeline that synthesizes EditTime-ready contexts and vulnerable blocks. The results show a clear gain in recall and practical viability, including a production deployment as a VSCode extension and substantial vulnerability reductions when filtering code-LLM outputs. This work demonstrates that EditTime vulnerability detection can meaningfully reduce the cost and risk of vulnerabilities in both handwritten and auto-generated code, while providing a scalable path for expanding coverage and monitoring real-world impact.

Abstract

Software vulnerabilities impose significant costs on enterprises. Despite extensive efforts in research and development of software vulnerability detection methods, uncaught vulnerabilities continue to put software owners and users at risk. Many current vulnerability detection methods require that code snippets compile and build before detection can be attempted. This, unfortunately, introduces a long latency between the time a vulnerability is injected and the time it is removed, which can substantially increase the cost of fixing a vulnerability. We recognize that current advances in machine learning can be used to detect vulnerable code patterns in syntactically incomplete code snippets as the developer is writing the code, at EditTime. In this paper we present a practical system that leverages deep learning on a large-scale data set of vulnerable code patterns to learn complex manifestations of more than 250 vulnerability types and detect vulnerable code patterns at EditTime. We discuss zero-shot, few-shot, and fine-tuning approaches on state-of-the-art pre-trained Large Language Models (LLMs). We show that our approach improves on state-of-the-art vulnerability detection models by 10%. We also evaluate our approach on detecting vulnerabilities in code auto-generated by code LLMs. Evaluation on a benchmark of high-risk code scenarios shows a vulnerability reduction of up to 90%.

Paper Structure (26 sections, 2 equations, 8 figures, 7 tables)

Introduction
Related Work
Vulnerability Detection
Deep Learning Vulnerability Detection
Vulnerability Detection in Auto-generated Code
Vulnerability Detection in Auto-generated Code
Detecting Vulnerabilities at EditTime
Data Collection
Data Pre-Processing
Models
Zero-shot Learning
Few-shot Learning
Fine-tuning
Model Variant Evaluation
Metrics
...and 11 more sections

Figures (8)

Figure 1: The best time to notify the developer about a SQL-Injection vulnerability is at EditTime, right after the developer has made the mistake.
Figure 2: A sample context and vulnerable block from our training data. In this example, the vulnerable block contains the SQL-Injection vulnerability.
Figure 3: A sample prompt created based on our template for zero-shot setting
Figure 4: A sample of CodexZero false positive due to overreach
Figure 5: A sample of CodexVuln false positive due to lack of context
...and 3 more figures
