VulDeePecker: A Deep Learning-Based System for Vulnerability Detection

Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, Yuyi Zhong

TL;DR

This work introduces VulDeePecker, a deep learning-based system that detects software vulnerabilities by learning patterns from code gadgets: small groups of semantically related (not necessarily consecutive) lines of code centered on library/API function calls. It defines the code gadget representation, builds a BLSTM-based model, and provides the first vulnerability dataset for deep learning approaches, derived from NVD and SARD, to train and evaluate the approach. Results show that VulDeePecker produces substantially fewer false negatives, with reasonable false positives, than traditional pattern-based tools and code-similarity approaches, and that it uncovers vulnerabilities not reported in the NVD. The study highlights the practical benefits and limitations of applying deep learning to vulnerability detection and points toward data-driven detection that does not depend on manually defined features.

Abstract

The automatic detection of software vulnerabilities is an important research problem. However, existing solutions to this problem rely on human experts to define features and often miss many vulnerabilities (i.e., incurring a high false negative rate). In this paper, we initiate the study of using deep learning-based vulnerability detection to relieve human experts from the tedious and subjective task of manually defining features. Since deep learning is motivated to deal with problems that are very different from the problem of vulnerability detection, we need some guiding principles for applying deep learning to vulnerability detection. In particular, we need to find representations of software programs that are suitable for deep learning. For this purpose, we propose using code gadgets to represent programs and then transform them into vectors, where a code gadget is a number of (not necessarily consecutive) lines of code that are semantically related to each other. This leads to the design and implementation of a deep learning-based vulnerability detection system, called Vulnerability Deep Pecker (VulDeePecker). In order to evaluate VulDeePecker, we present the first vulnerability dataset for deep learning approaches. Experimental results show that VulDeePecker can achieve far fewer false negatives (with reasonable false positives) than other approaches. We further apply VulDeePecker to 3 software products (namely Xen, Seamonkey, and Libav) and detect 4 vulnerabilities, which are not reported in the National Vulnerability Database but were "silently" patched by the vendors when releasing later versions of these products; in contrast, these vulnerabilities are almost entirely missed by the other vulnerability detection systems we experimented with.
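
To make the idea of representing a code gadget as a vector more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a gadget is given as a list of C source lines, renames user-defined identifiers to symbolic names (`VAR1`, `FUN1`, ...), and then maps tokens to integer ids padded to a fixed length. The helper names `symbolize` and `to_vector`, the short keyword and library-call lists, and the integer-id encoding are all assumptions made for illustration; a learned embedding would normally take the place of the integer ids.

```python
import re

# Illustrative (incomplete) lists; a real system would need full coverage.
C_KEYWORDS = {"char", "int", "void", "if", "else", "return", "sizeof", "for", "while"}
LIBRARY_CALLS = {"strcpy", "memcpy", "strlen", "malloc", "free"}

IDENT_RE = re.compile(r"[A-Za-z_]\w*")
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")  # identifiers, integers, single symbols


def symbolize(gadget_lines):
    """Tokenize a gadget and rename user-defined names to VARn / FUNn.

    Hypothetical helper: an identifier followed by "(" is treated as a
    function name, any other non-keyword identifier as a variable.
    """
    var_names, fun_names = {}, {}
    out = []
    for line in gadget_lines:
        tokens = TOKEN_RE.findall(line)
        for i, tok in enumerate(tokens):
            is_ident = IDENT_RE.fullmatch(tok) is not None
            if not is_ident or tok in C_KEYWORDS or tok in LIBRARY_CALLS:
                out.append(tok)  # keep keywords, library calls, literals, symbols
                continue
            is_call = i + 1 < len(tokens) and tokens[i + 1] == "("
            table, prefix = (fun_names, "FUN") if is_call else (var_names, "VAR")
            if tok not in table:
                table[tok] = f"{prefix}{len(table) + 1}"
            out.append(table[tok])
    return out


def to_vector(tokens, vocab, length):
    """Map tokens to integer ids (0 is padding) and pad or truncate to `length`."""
    ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in tokens]
    return (ids + [0] * length)[:length]


if __name__ == "__main__":
    # A made-up gadget: three related lines centered on a strcpy call.
    gadget = ["char buf[10];", "copy_input(buf, data);", "strcpy(buf, data);"]
    vocab = {}
    symbolic = symbolize(gadget)
    print(symbolic)
    print(to_vector(symbolic, vocab, 24))
```

Renaming identifiers first means two gadgets that differ only in variable or function names produce the same token sequence, which is the kind of invariance a pattern-learning model benefits from.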

Paper Structure

This paper contains 34 sections, 6 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: A brief review of BLSTM neural network
  • Figure 2: Overview of VulDeePecker: the learning phase generates vulnerability patterns, and the detection phase uses these vulnerability patterns to determine whether a target program is vulnerable or not and if so, the locations of the vulnerabilities (i.e., the corresponding code gadgets).
  • Figure 3: Illustrating the extraction of library/API function calls (Step I.1) from a (training) program, which contains a backward function call (i.e., $strcpy$) that is also used as an example to demonstrate the extraction of program slices (Step I.2) and the assembly of program slices into code gadgets (Step II.1).
  • Figure 4: Illustration of Step III.1: transforming code gadgets into their symbolic representations
  • Figure 5: F1-measure of VulDeePecker for the 6 datasets with different numbers of hidden layers

Theorems & Definitions (1)

  • Definition 1