
An empirical analysis of zero-day vulnerabilities disclosed by the Zero Day Initiative

Apurva Shet, Izzat Alsmadi

TL;DR

This study analyzes 415 ZDI-disclosed zero-day vulnerabilities from Jan–Apr 2024 to predict high-severity CVSS scores using a fusion of structured metadata and textual descriptions. It compares classical machine learning with deep learning approaches, employing multiple feature engineering and dimensionality-reduction pipelines, and emphasizes metrics beyond accuracy to address class imbalance. Key findings show that high severity correlates with widely deployed vendors and exploit-related keywords, while dimensionality reduction maintains accuracy and improves interpretability; DL models leveraging textual context enhance recall, achieving ROC-AUC near 0.99 in several setups. The work demonstrates practical value for vulnerability prioritization and patch management, with future directions including larger temporal windows and explainable AI for transparency.
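To make the point about metrics beyond accuracy concrete, the following is a minimal sketch, not the authors' evaluation code. The labels and scores are invented toy values (1 = high severity), and the helper names are hypothetical. It shows how a majority-class baseline can reach high accuracy with zero recall, while ROC-AUC evaluates the ranking of scores independently of any threshold.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true positives (label 1) that were predicted as 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as 0.5. This pairwise form is O(n_pos * n_neg), which is
    fine for a toy example.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, invented data: 8 negatives followed by 2 positives (imbalanced).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Baseline that always predicts the majority class.
baseline_pred = [0] * len(y_true)

# Hypothetical model scores, thresholded at 0.5 for hard predictions.
model_scores = [0.10, 0.20, 0.15, 0.30, 0.05, 0.40, 0.25, 0.60, 0.70, 0.55]
model_pred = [1 if s >= 0.5 else 0 for s in model_scores]

if __name__ == "__main__":
    print("baseline accuracy:", accuracy(y_true, baseline_pred))
    print("baseline recall:  ", recall(y_true, baseline_pred))
    print("model accuracy:   ", accuracy(y_true, model_pred))
    print("model recall:     ", recall(y_true, model_pred))
    print("model ROC-AUC:    ", roc_auc(y_true, model_scores))
```

On this toy data the baseline is 80% accurate yet finds none of the positive cases, which is why recall and ROC-AUC are reported alongside accuracy when classes are imbalanced.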

Abstract

Zero-day vulnerabilities represent some of the most critical threats in cybersecurity, as they correspond to previously unknown flaws in software or hardware that are actively exploited before vendors can develop and deploy patches. During this exposure window, affected systems remain defenseless, making zero-day attacks particularly damaging and difficult to mitigate. This study analyzes the Zero Day Initiative (ZDI) vulnerability disclosures reported between January and April 2024 (Cole [2025]), comprising a total of 415 vulnerabilities. The dataset includes vulnerability identifiers, Common Vulnerability Scoring System (CVSS) v3.0 scores, publication dates, and short textual descriptions. The primary objectives of this work are to identify trends in zero-day vulnerability disclosures, examine severity distributions across vendors, and investigate which vulnerability characteristics are most indicative of high severity. In addition, this study explores predictive modeling approaches for severity classification, comparing classical machine learning techniques with deep learning models using both structured metadata and unstructured textual descriptions. The findings aim to support improved patch prioritization strategies, more effective vulnerability management, and enhanced organizational preparedness against emerging zero-day threats.
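As an illustration of how CVSS v3.0 base scores can be turned into a severity-classification target, the sketch below maps a score to the standard CVSS v3.0 qualitative rating and derives a binary high-severity label. The 7.0 cutoff for "high severity" is an assumption made here for illustration; the abstract does not state the threshold the authors used. The function names and the example identifiers are hypothetical.

```python
def cvss_v3_rating(score: float) -> str:
    """Map a CVSS v3.0 base score to its qualitative severity rating.

    Bands follow the CVSS v3.0 specification:
    None 0.0, Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def is_high_severity(score: float, threshold: float = 7.0) -> bool:
    """Binary label for classification.

    ASSUMPTION: scores at or above `threshold` (High or Critical by
    default) count as high severity; the paper's actual cutoff may differ.
    """
    return score >= threshold


# Invented example records; the identifiers are placeholders, not real advisories.
records = [
    {"id": "EXAMPLE-001", "cvss": 9.8},
    {"id": "EXAMPLE-002", "cvss": 7.0},
    {"id": "EXAMPLE-003", "cvss": 5.5},
]

labelled = [
    {**r, "rating": cvss_v3_rating(r["cvss"]), "high": is_high_severity(r["cvss"])}
    for r in records
]

if __name__ == "__main__":
    for row in labelled:
        print(row)
```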

Paper Structure

This paper contains 11 sections, 12 figures, 4 tables.
