Enhanced LLM-Based Framework for Predicting Null Pointer Dereference in Source Code
Md. Fahim Sultan, Tasmin Karim, Md. Shazzad Hossain Shaon, Mohammad Wardat, Mst Shapna Akter
TL;DR
The paper tackles the detection of CWE-476 NULL pointer dereferences in source code by proposing DeLLNeuN, a fine-tuned LLM-based framework that leverages multi-layer CodeBERT representations to improve vulnerability prediction. By combining CodeBERT-derived features with a dropout, dense, and sigmoid classifier, DeLLNeuN outperforms several baselines (CodeBERT, GraphCodeBERT, RoBERTa, LSTM, GPT-2) on the Draper VDISC dataset, achieving around 0.87 accuracy and 0.88 precision. The study demonstrates the value of aggregating layer-wise CodeBERT information to enhance vulnerability detection in code, suggesting practical potential as an early vulnerability checker in software development. The work highlights both the benefits and limitations of current LLM-based code analysis approaches and emphasizes the need for scalable, resource-aware strategies to handle real-world, large-scale codebases for proactive cybersecurity.
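
The classifier head described above (layer-wise features, then dropout, a dense layer, and a sigmoid) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the mean-pooling of per-layer vectors, the function names (`mean_pool_layers`, `classify`), the dropout rate, and the toy vectors and weights are all assumptions made for illustration, and real CodeBERT hidden states would replace the hand-written lists.

```python
import math
import random

def mean_pool_layers(layer_vectors):
    """Average same-length vectors, one per encoder layer (assumed aggregation)."""
    n_layers = len(layer_vectors)
    dim = len(layer_vectors[0])
    return [sum(v[i] for v in layer_vectors) / n_layers for i in range(dim)]

def dropout(vec, rate, training, rng):
    """Inverted dropout: zero each element with probability `rate` while training."""
    if not training or rate == 0.0:
        return list(vec)
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vec]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def classify(layer_vectors, weights, bias, rate=0.1, training=False, rng=None):
    """Return an estimated probability that the input code is vulnerable."""
    rng = rng or random.Random(0)
    features = mean_pool_layers(layer_vectors)
    features = dropout(features, rate, training, rng)
    # Dense layer with a single output unit, followed by a sigmoid.
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(logit)

# Hypothetical toy inputs: two "layers" of three-dimensional features.
layers = [[0.2, -0.4, 0.6], [0.4, 0.0, 0.2]]
weights = [1.0, -1.0, 0.5]
prob = classify(layers, weights, bias=0.0)
print(f"probability of NPD: {prob:.3f}")
```

At inference time (`training=False`) dropout is a no-op, so the output depends only on the pooled features and the dense weights.
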
Abstract
Software security is crucial in any field where breaches can expose sensitive data and lead to financial losses. As a result, vulnerability detection is an essential part of the software development process. One of the key steps in maintaining software integrity is identifying vulnerabilities in the source code before deployment. A weakness such as CWE-476, NULL pointer dereference (NPD), is critical because it can cause software crashes, unpredictable behavior, and security vulnerabilities. Several vulnerability checkers exist, but previous tools often fall short in analyzing specific feature connections in the source code, which weakens them in real-world scenarios. In this study, we propose a novel approach using a fine-tuned Large Language Model (LLM) termed "DeLLNeuN". This model leverages multiple layers to reduce both overfitting and non-linearity, enhancing its performance and reliability. Additionally, the method applies dropout and dimensionality reduction to streamline the model, making it faster and more efficient. Our model achieved 87% accuracy and 88% precision on the Draper VDISC dataset. As software becomes more complex and cyber threats continuously evolve, the need for proactive security measures will keep growing. In this context, the proposed model looks promising as an early vulnerability checker in software development.
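
For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and precision are computed for a binary vulnerability classifier. The label convention (1 = vulnerable, 0 = not vulnerable) and the toy label lists are assumptions for illustration only; they are not drawn from the Draper VDISC dataset or from the paper's results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = vulnerable)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0

def precision(tp, fp):
    """Fraction of samples flagged as vulnerable that really are vulnerable."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0

# Hypothetical toy labels, not taken from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
print(f"accuracy={acc:.2f} precision={prec:.2f}")
```

High precision matters for an early vulnerability checker because every false positive costs developer time spent reviewing code that is not actually vulnerable.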
