SoK: On Closing the Applicability Gap in Automated Vulnerability Detection

Ezzeldin Shereen; Dan Ristea; Sanyam Vyas; Shae McFadden; Madeleine Dwyer; Chris Hicks; Vasilios Mavroudis

TL;DR

This SoK systematically analyzes the Automated Vulnerability Detection (AVD) literature (79 articles and 17 empirical studies) across five core components to assess its real-world applicability. It finds a narrow emphasis on function-level binary classification of C/C++ code, with insufficient multilingual support and heterogeneous evaluation practices that threaten reproducibility. The authors identify key gaps in the diversity of problem formulations, detection granularities, dataset quality, and open science, and propose directions such as cross-domain transfer learning, inclusion of binaries, and standardized benchmarks to bridge the applicability gap. The work highlights LLM- and GNN-based approaches as leading trends while cautioning about data leakage and stressing the need for realistic, time-aware evaluations to drive practical adoption in software security.
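To make the caution about data leakage and time-aware evaluation concrete, the following is a minimal sketch of one possible time-aware train/test split for function-level samples. It is not taken from the paper: the `Sample` record, the `fingerprint` helper, the whitespace-insensitive deduplication rule, and the example cutoff date are all assumptions chosen for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sample:
    """Hypothetical function-level sample (illustrative, not from the paper)."""
    code: str        # source text of one function
    label: int       # 1 = vulnerable, 0 = not vulnerable
    committed: date  # date the function was committed


def fingerprint(code: str) -> str:
    """Hash the code with all whitespace removed, so that copies differing
    only in formatting map to the same value."""
    normalized = "".join(code.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def time_aware_split(samples, cutoff):
    """Train on samples committed before `cutoff`, test on later ones.

    Test samples whose fingerprint already appears in the training set are
    dropped, which is one simple (assumed) guard against leakage.
    """
    train = [s for s in samples if s.committed < cutoff]
    seen = {fingerprint(s.code) for s in train}
    test = [
        s for s in samples
        if s.committed >= cutoff and fingerprint(s.code) not in seen
    ]
    return train, test


samples = [
    Sample("int f(char *s) { char b[8]; strcpy(b, s); return 0; }", 1, date(2021, 3, 1)),
    Sample("int g(int a) { return a + 1; }", 0, date(2021, 6, 1)),
    # Same function as g, reformatted and committed after the cutoff.
    Sample("int g(int a) {  return a + 1;  }", 0, date(2023, 1, 5)),
    Sample("int h(int *p) { return *p; }", 1, date(2023, 2, 1)),
]

train, test = time_aware_split(samples, cutoff=date(2022, 1, 1))
```

In this toy input, the reformatted copy of `g` is excluded from the test set because an equivalent function already appears in the training data. A random split could place the two copies on opposite sides and inflate the measured performance.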

Abstract

The frequent discovery of security vulnerabilities in both open-source and proprietary software underscores the urgent need for earlier detection during the development lifecycle. Initiatives such as DARPA's Artificial Intelligence Cyber Challenge (AIxCC) aim to accelerate Automated Vulnerability Detection (AVD), which seeks to address this challenge by autonomously analyzing source code to identify vulnerabilities. This paper addresses two primary research questions: (RQ1) How is current AVD research distributed across its core components? (RQ2) What key areas should future research target to bridge the gap in the practical applicability of AVD throughout software development? To answer these questions, we conduct a systematization of 79 AVD articles and 17 empirical studies, analyzing them across five core components: task formulation and granularity, input programming languages and representations, detection approaches and key solutions, evaluation metrics and datasets, and reported performance. Our systematization reveals that the narrow focus of AVD research, mainly on specific tasks and programming languages, limits its practical impact and overlooks broader areas crucial for effective, real-world vulnerability detection. We identify significant challenges, including the need for diversified problem formulations, varied detection granularities, broader language support, better dataset quality, enhanced reproducibility, and increased practical impact. Based on these findings, we identify research directions that will enhance the effectiveness and applicability of AVD solutions in software security.
