A Decade-long Landscape of Advanced Persistent Threats: Longitudinal Analysis and Global Trends

Shakhzod Yuldoshkhujaev, Mijin Jeon, Doowon Kim, Nick Nikiforakis, Hyungjoon Koo

TL;DR

This work presents a decade-long macro-analysis of advanced persistent threats (2014–2023), aggregating 1,509 open technical reports (24,215 pages) and 603 threat actors to map global APT trends across victim countries, actors, targets, and initial infiltration methods. A hybrid information-retrieval pipeline combines large-language-model (LLM) inference with rule-based extraction to retrieve ten CTI-relevant items (e.g., MITRE ATT&CK IDs, CVEs, YARA rules) from diverse sources; GPT-4-Turbo delivers the strongest retrieval performance among the LLMs, and IoCParser augments CVE/MITRE/YARA extraction. Key findings include broad global reach (154 countries), the US as the dominant target, Lazarus as the most active actor, prevalent use of malicious documents and spear phishing, and a decline in zero-day exploitation after 2016, alongside significant links to external political and economic events. The paper also delivers interactive visualizations (an APT map, flow diagrams, and timelines) and publicly releases the curated dataset to support ongoing CTI research and policy development. The work highlights the value of macro-scale, data-driven CTI synthesis for understanding APT evolution and informing defense strategies.

Abstract

An advanced persistent threat (APT) refers to a covert, long-term cyberattack, typically conducted by state-sponsored actors, that targets critical sectors and often remains undetected for long periods. In response, the global security community collaborates to identify and trace these surreptitious activities, generating substantial documentation on APT campaigns that is publicly available on the web. While prior works predominantly focus on specific aspects of APT cases, such as detection, evaluation, cyber threat intelligence, and dataset creation, limited attention has been devoted to revisiting and investigating these scattered dossiers in a longitudinal manner. The objective of our study is to fill this gap by offering a macro perspective that connects key insights and global trends in past APT attacks. We systematically analyze six reliable sources (three focused on technical reports and three on threat actors), examining 1,509 APT dossiers (24,215 pages) spanning 2014–2023 and identifying 603 unique APT groups worldwide. To efficiently unearth relevant information, we employ a hybrid methodology that combines rule-based information retrieval with large-language-model-based search techniques. Our longitudinal analysis reveals shifts in threat actor activities, global attack vectors, changes in targeted sectors, and relationships between cyberattacks and significant events such as elections or wars, which provide insights into historical patterns in APT evolution. Over the past decade, 154 countries have been affected, with malicious documents and spear phishing serving as the dominant initial infiltration vectors and a noticeable decline in zero-day exploitation since 2016. Furthermore, we present our findings through interactive visualization tools, such as an APT map and a flow diagram, to facilitate an intuitive understanding of global patterns and trends in APT activities.
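
To make the rule-based half of the hybrid methodology more concrete, the following is a minimal sketch of how CVE identifiers and MITRE ATT&CK technique IDs might be pulled from report text with regular expressions. The patterns, the function name `extract_indicators`, and the sample sentence are illustrative assumptions; they are not the authors' rules, nor the behavior of IoCParser.

```python
import re

# Illustrative patterns (assumptions, not the paper's actual rules).
# CVE IDs look like CVE-YYYY-NNNN with four or more trailing digits.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)
# ATT&CK technique IDs look like T1234 or, for sub-techniques, T1234.001.
ATTACK_RE = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def extract_indicators(text: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted CVE and ATT&CK IDs found in text."""
    cves = sorted({match.upper() for match in CVE_RE.findall(text)})
    techniques = sorted(set(ATTACK_RE.findall(text)))
    return {"cve": cves, "attack_technique": techniques}


# Hypothetical report sentence, written only for this example.
sample = (
    "The actor exploited CVE-2017-11882 and cve-2017-11882 via "
    "spear phishing (T1566.001), then used T1059."
)
print(extract_indicators(sample))
```

Patterns like these only cover items with a rigid syntax; free-form items such as victim countries or target sectors are where the abstract's LLM-based search would come in.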

Paper Structure

This paper contains 23 sections, 11 figures, and 11 tables.

Figures (11)

  • Figure 1: Overview of our methodology for longitudinal APT analysis. We collect technical reports, threat actors, and news articles from across the web (Section \ref{sec:dataset}). Then, we extract valuable information from the technical reports using rules and LLMs (Section \ref{ss:retrieval}). Note that we manually inspect attack duration (*) for precise analysis. Next, we refine the raw information via normalization, categorization, de-duplication, and filtering (Section \ref{subsec:sanitize}). Lastly, we conduct in-depth analyses to answer our research questions (Section \ref{sec:eval}).
  • Figure 2: Top 15 sources from the collection of technical reports (TRs). Most reports come from reputable sources such as Kaspersky [kasperskyInfo] and Trend Micro [TrendMicroDescr]. We confirmed that 1,412 TRs (93.6%) are highly credible (Section \ref{sec:dataset}).
  • Figure 3: Summary of global APT trends over the past decade. Each box presents six key points for a given year: the number of APT campaigns, the number of associated malware samples, the most frequently attacked country, the primary threat actor, the most commonly used initial attack vector, and the most targeted sector. Over the last 10 years, the most frequently targeted country, the most active APT group, the predominant attack method, and the most commonly targeted sector have been the United States, the Lazarus group, malicious documents (with spear phishing contributing almost equally), and the government/defense sector, respectively (Section \ref{ss:rq1}). Note that the numbers in parentheses represent the occurrences per year.
  • Figure 4: Decadal trends in APT activity by victim countries (left) and threat actors (right). Circle size reflects the frequency of APT incidents, while the color gradation represents the number of zero-day vulnerabilities associated with each entity as a concrete value (i.e., lighter red indicates fewer occurrences) (Section \ref{ss:rq1}).
  • Figure 5: Decadal APT trends in 12 target sectors (left) and 12 initial attack vectors (right). We follow the sector categories from the guides [sectorGrouping_sectrio_apt_guide, sectorGrouping_google] and the attack vectors that Sharma et al. [APT_new_survey] proposed. The figures illustrate the distribution of each sector/vector over the last 10 years. The percentages within each stacked bar indicate the three most common target sectors and attack vectors for each year, along with their respective proportions. Note that a single APT case may entail multiple target sectors or attack vectors, which we count individually (Section \ref{ss:rq1}).
  • ...and 6 more figures
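
The counting rule noted in the Figure 5 caption (a single APT case may carry several target sectors or attack vectors, each counted individually) can be sketched as below. This is a minimal illustration under stated assumptions: the record layout, the field names `year` and `vectors`, the helper names, and the three sample cases are all hypothetical and are not taken from the released dataset.

```python
from collections import Counter, defaultdict

# Hypothetical records invented for illustration; not real dataset entries.
cases = [
    {"year": 2016, "vectors": ["spear phishing", "malicious documents"]},
    {"year": 2016, "vectors": ["malicious documents"]},
    {"year": 2017, "vectors": ["watering hole", "spear phishing"]},
]


def count_labels_per_year(records, field):
    """Count every label of a multi-label field once per case, grouped by year."""
    counts = defaultdict(Counter)
    for record in records:
        # set() guards against a label listed twice within one case.
        for label in set(record[field]):
            counts[record["year"]][label] += 1
    return counts


def top_shares(counter, k=3):
    """Return the k most common labels with their share of all label counts."""
    total = sum(counter.values())
    return [(label, n / total) for label, n in counter.most_common(k)]


per_year = count_labels_per_year(cases, "vectors")
for year in sorted(per_year):
    print(year, top_shares(per_year[year]))
```

Because labels are counted individually, the shares within a year are taken over the total number of labels rather than the number of cases, which is one plausible reading of how the stacked-bar percentages are normalized.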