Assessing and Prioritizing Ransomware Risk Based on Historical Victim Data

Spencer Massengale; Philip Huff

TL;DR

The paper addresses the problem of predicting which entities ransomware actors will target, using historical victim data to prioritize defenses. It introduces a pipeline that uses LLMs to extract SKRAM threat actor profiles from public disclosures, converts the output to STIX, and augments the data synthetically. A Random Forest classifier predicts a risk score for each entity and is validated on a dataset augmented with synthetic data and with a time-sensitive activity measure, the EWMA metric $V_t = \lambda V_{t-1} + (1-\lambda) x_t$. The paper discusses limitations arising from data scarcity and geographic bias, and outlines future directions, including broader LLM comparisons and integration of real-world data, to make ransomware risk assessment more actionable for organizations.
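To make the EWMA recurrence above concrete, here is a minimal Python sketch. It assumes that $x_t$ is a per-period activity count (for example, disclosed victims per week) and that the series starts from an initial value $V_0$; the function name `ewma`, the parameter names `lam` and `v0`, and the sample counts are illustrative assumptions, not values taken from the paper.

```python
from typing import Iterable, List


def ewma(observations: Iterable[float], lam: float, v0: float = 0.0) -> List[float]:
    """Apply V_t = lam * V_{t-1} + (1 - lam) * x_t to each observation in order.

    `lam` is the weight on the previous value (higher means a longer memory);
    `v0` is an assumed starting value for V before the first observation.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    values: List[float] = []
    v = v0
    for x in observations:
        v = lam * v + (1.0 - lam) * x
        values.append(v)
    return values


# Hypothetical weekly victim counts for one threat actor (illustrative only).
weekly_victims = [0, 4, 0, 0, 8]
scores = ewma(weekly_victims, lam=0.5)
print(scores)  # [0.0, 2.0, 1.0, 0.5, 4.25]
```

With `lam=0.5`, each new observation and the accumulated history carry equal weight, so a burst of recent activity raises the score quickly and the score halves for every quiet period that follows.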

Abstract

We present an approach to identifying which ransomware adversaries are most likely to target specific entities, thereby helping these entities formulate better protection strategies. Ransomware poses a formidable cybersecurity threat, characterized by profit-driven motives, a complex underlying economy that supports criminal syndicates, and the overt nature of its attacks. This type of malware has consistently ranked among the most prevalent, and its activity has escalated rapidly. Recent estimates indicate that approximately two-thirds of organizations experienced ransomware attacks in 2023 \cite{Sophos2023Ransomware}. A central tactic in ransomware campaigns is publicizing attacks to coerce victims into paying ransoms. Our study uses public disclosures from ransomware victims to predict the likelihood of an entity being targeted by a specific ransomware variant. We employ a Large Language Model (LLM) architecture that uses a unique chain-of-thought, multi-shot prompt methodology to define adversary SKRAM (Skills, Knowledge, Resources, Authorities, and Motivation) profiles from ransomware bulletins, threat reports, and news items. This analysis is enriched with publicly available victim data and further enhanced by a heuristic for generating synthetic data that reflects victim profiles. Our work culminates in a machine learning model that helps organizations prioritize ransomware threats and formulate defenses based on the tactics, techniques, and procedures (TTPs) of the most likely attackers.
