Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models

Shaznin Sultana; Sadia Afreen; Nasir U. Eisty

TL;DR

This study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories.

Abstract

The growing number of vulnerabilities in software, driven by heavy dependence on open-source projects, has received considerable attention recently. This paper investigates the effectiveness of Large Language Models (LLMs) in identifying vulnerabilities within codebases, with a focus on the latest advancements in LLM technology. Through a comparative analysis, we assess the performance of emerging LLMs, specifically Llama, CodeLlama, Gemma, and CodeGemma, alongside established state-of-the-art models such as BERT, RoBERTa, and GPT-3. Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. We observe that, among the recently released LLMs evaluated, CodeGemma achieves the highest F1-score (58%) and a recall of 87% in detecting software security vulnerabilities.
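For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how precision, recall, and F1-score relate to one another. It is not taken from the paper: the function names and the confusion-matrix counts are hypothetical, chosen only so that the toy example lands on a recall of 0.87 and an F1-score of 0.58. The precision implied by those two rounded figures is a back-of-the-envelope inference, not a result reported here.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts.

    tp: vulnerable samples correctly flagged
    fp: safe samples wrongly flagged as vulnerable
    fn: vulnerable samples that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 and recall are inconsistent (F1 must be < 2 * recall)")
    return f1 * recall / denom


# Hypothetical counts, NOT from the paper: 87 of 100 vulnerable samples
# are caught, but 113 safe samples are also flagged.
p, r, f1 = precision_recall_f1(tp=87, fp=113, fn=13)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Precision implied by the rounded abstract figures (an inference only).
print(f"implied precision ~ {implied_precision(0.58, 0.87):.3f}")
```

The point of the sketch is that a high recall paired with a moderate F1-score suggests a model that flags most vulnerable code while also raising a substantial number of false alarms.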

Paper Structure (22 sections, 10 figures, 3 tables)

Introduction
Related Works
Related Datasets
Traditional Detection
Deep Learning based Detection
LLM based Detection
Approach
Dataset Collection and Preparation
Handling Class Imbalance
Prompt Engineering
Fine Tune Base Model
Evaluate LLMs
Compare Results with State-of-the-Art LLMs
Implementation
Experimentation Setup
...and 7 more sections

Figures (10)

Figure 1: An example of vulnerable code from the DiverseVul dataset (Chen et al., 2023)
Figure 2: An overall solution approach
Figure 3: Dataset Visualization
Figure 4: Imbalanced Class Distribution
Figure 5: Balanced Class Distribution
...and 5 more figures
