A Systematic Approach to Predict the Impact of Cybersecurity Vulnerabilities Using LLMs
Anders Mølmen Høst, Pierre Lison, Leon Moonen
TL;DR
Vulnerability databases such as the NVD do not directly connect CVEs to ATT&CK TTPs, which complicates risk assessment. The paper introduces TRIAGE, a hybrid pipeline that combines mappers guided by MITRE's CVE Mapping Methodology (CMM) with an LLM module based on in-context learning to automatically map CVEs to ATT&CK techniques and rank them by likelihood. An empirical evaluation on the KEV dataset shows that in-context learning outperforms the individual mappers and that the hybrid approach yields higher recall of exploitation techniques. GPT-4o-mini outperforms Llama3.3-70B on this task, and a replication package is provided. The work demonstrates a scalable, cost-aware method for predicting vulnerability impact that supports defense planning and threat hunting, while highlighting remaining challenges in predicting secondary impacts and in keeping pace with evolving ATT&CK techniques.
Abstract
Vulnerability databases, such as the National Vulnerability Database (NVD), offer detailed descriptions of Common Vulnerabilities and Exposures (CVEs), but often lack information on their real-world impact, such as the tactics, techniques, and procedures (TTPs) that adversaries may use to exploit the vulnerability. Manually linking CVEs to their corresponding TTPs is a challenging and time-consuming task, and the high volume of new vulnerabilities published annually makes automated support desirable. This paper introduces TRIAGE, a two-pronged automated approach that uses Large Language Models (LLMs) to map CVEs to relevant techniques from the ATT&CK knowledge base. We first prompt an LLM with instructions based on MITRE's CVE Mapping Methodology to predict an initial list of techniques. This list is then combined with the results from a second LLM-based module that uses in-context learning to map a CVE to relevant techniques. This hybrid approach combines rule-based reasoning with data-driven inference. Our evaluation reveals that in-context learning outperforms the individual mapping methods, and that the hybrid approach improves recall of exploitation techniques. We also find that GPT-4o-mini performs better than Llama3.3-70B on this task. Overall, our results show that LLMs can be used to automatically predict the impact of cybersecurity vulnerabilities and that TRIAGE makes the process of mapping CVEs to ATT&CK more efficient. A replication package is available for download from https://doi.org/10.5281/zenodo.17341503.
Keywords: vulnerability impact, CVE, ATT&CK techniques, large language models, automated mapping.
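To make the hybrid step more concrete, the following minimal Python sketch shows one way two ranked lists of ATT&CK technique IDs could be merged and then scored with recall. It is an illustration under stated assumptions, not the method used in TRIAGE: the interleaving strategy, the function names `merge_ranked` and `recall_at_k`, and the example technique lists are all hypothetical and are not taken from the paper or its replication package.

```python
from itertools import zip_longest


def merge_ranked(icl_predictions, cmm_predictions):
    """Interleave two ranked lists of technique IDs, dropping duplicates.

    Assumption: simple round-robin interleaving, with the in-context
    learning list taking precedence at each rank. The actual combination
    rule used by TRIAGE may differ.
    """
    merged = []
    seen = set()
    for pair in zip_longest(icl_predictions, cmm_predictions):
        for technique in pair:
            if technique is not None and technique not in seen:
                seen.add(technique)
                merged.append(technique)
    return merged


def recall_at_k(predicted, relevant, k):
    """Fraction of relevant techniques found among the top-k predictions."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(predicted[:k]) & relevant) / len(relevant)


# Hypothetical predictions for a single CVE (illustrative values only).
icl = ["T1190", "T1203", "T1059"]   # in-context learning module
cmm = ["T1190", "T1059", "T1068"]   # methodology-guided module
gold = {"T1190", "T1068"}           # hypothetical ground-truth techniques

hybrid = merge_ranked(icl, cmm)
print(hybrid)
print(recall_at_k(icl, gold, k=3), recall_at_k(hybrid, gold, k=4))
```

In this toy case the merged list recovers a technique that only one module predicted, which is the kind of recall gain the hybrid approach aims for. The trade-off is a longer candidate list, so a ranking cutoff matters when comparing the two.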
