EaTVul: ChatGPT-based Evasion Attack Against Software Vulnerability Detection

Shigang Liu; Di Cao; Junae Kim; Tamas Abraham; Paul Montague; Seyit Camtepe; Jun Zhang; Yang Xiang

TL;DR

Machine-learning-based software vulnerability detectors are susceptible to adversarial evasion. EaTVul is a black-box attack that combines SVM-driven identification of important samples, attention-based feature extraction, ChatGPT-generated adversarial data, and a fuzzy genetic algorithm (FGA) for seed selection. The framework has two phases: it first generates adversarial data, then learns which optimized snippets to insert into vulnerable samples so that the detector's prediction flips. The attack achieves high success rates across multiple datasets and languages, and optimizing the adversarial data increases its effectiveness further. Key contributions are the two-phase EaTVul pipeline, a reproducible ChatGPT-based process for generating adversarial data, and an FGA seed-selection strategy that outperforms random baselines. These findings highlight the need for robust defenses against adversarial manipulation in software vulnerability detection and motivate future work on defense mechanisms and cross-language resilience.

Abstract

Recently, deep learning has demonstrated promising results in improving the accuracy of software vulnerability detection. However, these techniques are still vulnerable to attacks. Adversarial examples can exploit weaknesses within deep neural networks, posing a significant threat to system security. This study showcases the susceptibility of deep learning models to adversarial attacks, which can achieve a 100% attack success rate (refer to Table 5). The proposed method, EaTVul, encompasses six stages: identification of important samples using support vector machines, identification of important features using the attention mechanism, generation of adversarial data based on these features using ChatGPT, preparation of an adversarial attack pool, selection of seed data using a fuzzy genetic algorithm, and execution of an evasion attack. Extensive experiments demonstrate the effectiveness of EaTVul, which achieves an attack success rate of more than 83% when the snippet size is greater than 2. Furthermore, in most cases with a snippet size of 4, EaTVul achieves a 100% attack success rate. The findings of this research emphasize the necessity of robust defenses against adversarial attacks in software vulnerability detection.
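
To make the reported metric concrete, the following minimal Python sketch computes an attack success rate for a snippet-insertion evasion attack. It is an illustration under stated assumptions, not the authors' implementation: the function `attack_success_rate`, the keyword-counting `toy_detector`, the sample strings, and the choice to append snippets at the end of each sample are all hypothetical, and the rate is assumed here to be the fraction of correctly detected vulnerable samples whose prediction flips to non-vulnerable.

```python
from typing import Callable, Sequence


def attack_success_rate(
    detector: Callable[[str], bool],
    vulnerable_samples: Sequence[str],
    snippets: Sequence[str],
) -> float:
    """Fraction of detected vulnerable samples that evade the detector
    once the adversarial snippets are appended (assumed definition)."""
    # Only samples the detector initially flags can be "flipped".
    detected = [s for s in vulnerable_samples if detector(s)]
    if not detected:
        return 0.0
    payload = "\n".join(snippets)
    flipped = sum(1 for s in detected if not detector(s + "\n" + payload))
    return flipped / len(detected)


def toy_detector(code: str) -> bool:
    """Hypothetical stand-in for a learned detector: flags code as
    vulnerable when 'strcpy' occurs more often than 'check'."""
    return code.count("strcpy") > code.count("check")


# Hypothetical vulnerable samples; the third is missed by the toy detector.
samples = [
    "strcpy(dst, src);",
    "strcpy(a, b); strcpy(c, d);",
    "memcpy(dst, src, n);",
]

# Snippet size 1 versus snippet size 2 (both snippets are made up).
asr_one = attack_success_rate(toy_detector, samples, ["int check = 0;"])
asr_two = attack_success_rate(
    toy_detector, samples, ["int check = 0;", "if (check) {}"]
)
print(asr_one, asr_two)
```

In this toy setting, adding a second snippet flips more predictions than a single snippet does. That only illustrates how snippet size enters the metric and says nothing about the detectors or datasets evaluated in the paper.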

Paper Structure (19 sections, 9 equations, 6 figures, 8 tables)

Introduction
Related Work
Overview of the EaTVul
Adversarial Data Generation
Important Samples Identification using SVM
Important Feature Identification
Adversarial Data Generation using ChatGPT
Preserved Attack Pool Generation
Adversarial Learning
Seed Data Selection using FGA
Evasion Attack
Visualization of EaTVul evasion attack
Experimental Setup
Datasets
Evaluation Metrics
...and 4 more sections

Figures (6)

Figure 1: Detection of a vulnerable sample can be easily bypassed by adding a precisely crafted piece of adversarial data. Left: a vulnerable function predicted as vulnerable with a high probability of 93.2%. Right: the same vulnerable function predicted as non-vulnerable with a high probability of 87.4% after optimized adversarial data generated by EaTVul is added.
Figure 2: The framework of EaTVul.
Figure 3: Framework of feature learning and attention mechanism.
Figure 4: Important features identified by the attention mechanism. Importance decreases from red to yellow.
Figure 5: Raw adversarial data and optimized adversarial data generated by ChatGPT. The top figure contains more than 25 lines of code, while the bottom one displays a more concise version with no more than 8 lines.
...and 1 more figures
