On Selecting Few-Shot Examples for LLM-based Code Vulnerability Detection

Md Abdul Hannan, Ronghao Ni, Chi Zhang, Limin Jia, Ravi Mangal, Corina S. Pasareanu

TL;DR

This work aims to improve LLM-based code vulnerability detection through careful selection of few-shot examples. It introduces two primitives, Learn-from-Mistakes ($LFM$) and Learn-from-Nearest-Neighbors ($LFNN$), and proposes three methods for combining them into query-specific few-shot sets. Across several datasets and models, the combined strategies improve robustness and F1 scores in particular, and their performance sometimes rivals a strong closed-model baseline. The findings highlight the importance of balanced, prompt-optimized in-context learning for practical vulnerability detection with open-source LLMs.

Abstract

Large language models (LLMs) have demonstrated impressive capabilities for many coding tasks, including summarization, translation, completion, and code generation. However, detecting code vulnerabilities remains a challenging task for LLMs. An effective way to improve LLM performance is in-context learning (ICL): providing few-shot examples similar to the query, along with correct answers, can improve an LLM's ability to generate correct solutions. Choosing the few-shot examples appropriately is, however, crucial to improving model performance. In this paper, we explore two criteria for choosing few-shot examples for ICL in the code vulnerability detection task. The first criterion considers whether the LLM (consistently) makes a mistake on a sample, with the intuition that LLM performance on a sample is informative about its usefulness as a few-shot example. The second criterion considers the similarity of the examples to the program under query and chooses few-shot examples from the $k$-nearest neighbors of the given sample. We perform evaluations to determine the benefits of these criteria individually as well as under various combinations, using open-source models on multiple datasets.
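The two selection criteria can be illustrated with a minimal sketch. This is not the paper's algorithm: the example pool, the field names (`vec`, `label`, `preds`), the use of cosine similarity over precomputed embedding vectors, and the rule that "consistently wrong" means every repeated prediction disagrees with the label are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def consistently_wrong(predictions: Sequence[bool], label: bool) -> bool:
    """Assumed rule: the model erred on every repeated trial for this sample."""
    return len(predictions) > 0 and all(p != label for p in predictions)


def select_mistakes(pool: List[Dict]) -> List[Dict]:
    """Mistake-based criterion: keep samples the model consistently gets wrong."""
    return [ex for ex in pool if consistently_wrong(ex["preds"], ex["label"])]


def select_nearest(query_vec: Sequence[float], pool: List[Dict], k: int) -> List[Dict]:
    """Similarity-based criterion: the k pool samples closest to the query."""
    ranked = sorted(
        pool,
        key=lambda ex: cosine_similarity(query_vec, ex["vec"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical pool: "label" is True for vulnerable code, "preds" holds the
# model's answers over repeated zero-shot trials, "vec" is a code embedding.
pool = [
    {"id": "a", "vec": [1.0, 0.0], "label": True, "preds": [False, False, False]},
    {"id": "b", "vec": [0.9, 0.1], "label": False, "preds": [False, False, False]},
    {"id": "c", "vec": [0.0, 1.0], "label": True, "preds": [True, False, True]},
]

nearest_ids = [ex["id"] for ex in select_nearest([1.0, 0.0], pool, k=2)]
mistake_ids = [ex["id"] for ex in select_mistakes(pool)]
```

In this toy pool, sample `c` is excluded by the mistake criterion because the model answered correctly on some trials; a looser threshold (for example, a majority of trials wrong) would be an equally plausible reading of "consistently".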

Paper Structure

This paper contains 23 sections, 3 tables, 3 algorithms.