ThreatLinker: An NLP-based Methodology to Automatically Estimate CVE Relevance for CAPEC Attack Patterns
Andrea Ciavotta, Alessandro Palma, Simone Lenti, Silvia Bonomi
TL;DR
ThreatLinker automatically links CVE vulnerabilities to CAPEC attack patterns by fusing semantic similarity with targeted keyword signals. It uses SBERT and ATTACK BERT to capture meaning, while a keyword module exploits acronyms and domain terms. The two signals are combined through a weighted score S_overall = α · S_semantic + (1 − α) · S_keyword, with α = 0.3, which is used to rank CAPEC patterns for each CVE. Evaluations on a larger, expert-validated CVE–CAPEC dataset (GT2) and on a replication ground truth (GT1) show that ThreatLinker outperforms state-of-the-art baselines in Recall@K and MRR, although Precision@K remains modest due to evaluation constraints and dataset characteristics. The work advances automated threat modeling by publicly releasing the dataset and by providing a hybrid framework that reduces the manual effort of linking vulnerabilities to attack patterns. Future work will incorporate CAPEC hierarchies and richer attributes.
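The weighted combination above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the assumption that both scores are already scaled to [0, 1], the treatment of a missing keyword score as 0, and the example scores are all hypothetical. Only the formula and α = 0.3 come from the summary.

```python
from typing import Dict, List, Tuple

ALPHA = 0.3  # weight on the semantic score, as stated in the TL;DR


def overall_score(s_semantic: float, s_keyword: float, alpha: float = ALPHA) -> float:
    """S_overall = alpha * S_semantic + (1 - alpha) * S_keyword."""
    return alpha * s_semantic + (1.0 - alpha) * s_keyword


def rank_capecs(
    semantic: Dict[str, float],
    keyword: Dict[str, float],
    alpha: float = ALPHA,
) -> List[Tuple[str, float]]:
    """Rank CAPEC ids for one CVE by descending combined score.

    Assumption: both inputs map CAPEC ids to scores already scaled to [0, 1];
    a CAPEC id missing from `keyword` is treated as a keyword score of 0.
    """
    scored = [
        (capec_id, overall_score(sem, keyword.get(capec_id, 0.0), alpha))
        for capec_id, sem in semantic.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up scores for a single CVE, purely for illustration.
semantic_scores = {"CAPEC-66": 0.80, "CAPEC-63": 0.60, "CAPEC-100": 0.70}
keyword_scores = {"CAPEC-66": 0.90, "CAPEC-63": 0.20, "CAPEC-100": 0.10}
ranking = rank_capecs(semantic_scores, keyword_scores)

for capec_id, score in ranking:
    print(f"{capec_id}: {score:.2f}")
```

With these invented inputs the ranking is CAPEC-66, CAPEC-63, CAPEC-100. Because α = 0.3 places most of the weight on the keyword signal, CAPEC-63 ends up above CAPEC-100 even though its semantic score is lower.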
Abstract
Threat analysis is growing in importance due to the ever-increasing complexity and frequency of cyber attacks. Analyzing threats demands significant effort from security experts: several cybersecurity knowledge bases support this task, but manual effort is still required to correlate these heterogeneous sources into a unified view that enables a more comprehensive assessment. To address this gap, we propose ThreatLinker, a methodology that leverages Natural Language Processing (NLP) to effectively and efficiently associate Common Vulnerabilities and Exposures (CVE) entries with Common Attack Pattern Enumeration and Classification (CAPEC) attack patterns. The proposed technique combines semantic similarity with keyword analysis to improve the accuracy of the estimated associations. We contribute a larger dataset for CVE-CAPEC correlation, and experimental evaluations demonstrate superior performance compared to state-of-the-art models.
