Machine Learning Techniques for Python Source Code Vulnerability Detection

Talaya Farasat, Joachim Posegga

TL;DR

The paper applies machine learning to detect vulnerabilities in Python source code, focusing on common vulnerability categories in Python programs. It compares five models (Gaussian Naive Bayes, Decision Tree, Logistic Regression, MLP, and BiLSTM) using word2vec token embeddings trained on a GitHub-derived dataset of vulnerability-fixing commits. The tuned BiLSTM configuration has 1 input layer, 3 hidden BiLSTM layers of 50 units each, four dropout layers with a rate of 0.2, the Adam optimizer, and 50 training epochs with batch size 128. The BiLSTM with word2vec achieves an average accuracy of 98.6%, F1-score of 94.7%, and ROC of 99.3%, which the authors present as a new benchmark for Python source code vulnerability detection. The code and models are open source to support reproducibility and practical use by researchers and developers.

Abstract

Software vulnerabilities are a fundamental reason for the prevalence of cyber attacks, and their identification is a crucial yet challenging problem in cyber security. In this paper, we apply and compare different machine learning algorithms for source code vulnerability detection, specifically for the Python programming language. Our experimental evaluation demonstrates that our Bidirectional Long Short-Term Memory (BiLSTM) model achieves remarkable performance (average Accuracy = 98.6%, average F-Score = 94.7%, average Precision = 96.2%, average Recall = 93.3%, average ROC = 99.3%), thereby establishing a new benchmark for vulnerability detection in Python source code.
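
The reported metrics are related by standard definitions, so a quick consistency check is possible. The sketch below is not the authors' evaluation code. It assumes the F-Score is the usual F1 (the harmonic mean of precision and recall), and the function name `f1_score` is chosen here for illustration only. An F1 averaged over vulnerability categories does not in general equal the F1 of the averaged precision and recall, so this is a sanity check rather than a reproduction.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score).

    Both inputs are fractions in [0, 1]. Returns 0.0 when both are zero,
    which avoids a division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averages reported in the abstract, expressed as fractions.
reported_precision = 0.962
reported_recall = 0.933

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from averaged precision and recall: {100 * f1:.1f}%")  # prints 94.7%
```

The result agrees with the reported average F-Score of 94.7% to one decimal place.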

Paper Structure (4 sections, 1 figure, 2 tables)

This paper contains 4 sections, 1 figure, 2 tables.

Figures (1)

  • Figure 1: ROC Curves (see remaining ROC results here: https://github.com/Tf-arch/Python-Source-Code-Vulnerability-Detection/tree/main)