SoK: Exposing the Generation and Detection Gaps in LLM-Generated Phishing Through Examination of Generation Methods, Content Characteristics, and Countermeasures

Fengchao Chen, Tingmin Wu, Van Nguyen, Carsten Rudolph

TL;DR

The paper addresses the threat of LLM-generated phishing by presenting the first holistic SoK that systematizes the generation-to-defense lifecycle. It introduces a nine-stage taxonomy anchored in three generation paradigms (prompt-guided, data-guided, and adversarial-guided) and analyzes the content traits, human factors, and model traits that characterize these attacks. It surveys existing defenses (content-tailored and human-centric) and benchmarking practices, revealing significant gaps where defenses lag behind offensive capabilities, particularly in cross-channel and multimodal contexts. The work offers a roadmap for building standardized datasets, robust evaluation metrics, and adaptive defense frameworks to counter personalized phishing at scale.

Abstract

Phishing campaigns involve adversaries masquerading as trusted vendors to trigger user behavior that enables them to exfiltrate private data. While URLs are an important part of phishing campaigns, communicative elements like text and images are central in triggering the required user behavior. Further, in response to advances in phishing detection, attackers scale campaigns to larger volumes and diversify and personalize content. In addition to established mechanisms, such as template-based generation, large language models (LLMs) can be used for phishing content generation, enabling attacks to scale in minutes and challenging existing phishing detection paradigms through personalized content, stealthy explicit phishing keywords, and dynamic adaptation to diverse attack scenarios. Countering these dynamically changing attack campaigns requires a comprehensive understanding of the complex LLM-related threat landscape. Existing studies are fragmented and focus on specific areas. In this work, we provide the first holistic examination of LLM-generated phishing content. First, to trace the exploitation pathways of LLMs for phishing content generation, we adopt a modular taxonomy documenting nine stages by which adversaries breach LLM safety guardrails. Second, we characterize how LLM-generated phishing manifests as threats, revealing that it evades detectors while emphasizing human cognitive manipulation. Third, by taxonomizing defense techniques aligned with generation methods, we expose a critical asymmetry: offensive mechanisms adapt dynamically to attack scenarios, whereas defensive strategies remain static and reactive. Finally, based on a thorough analysis of the existing literature, we highlight insights and gaps and suggest a roadmap for understanding and countering LLM-driven phishing at scale.

Paper Structure

This paper contains 20 sections, 4 figures, 11 tables.

Figures (4)

  • Figure 1: The Full Lifecycle of LLM-Enabled Phishing: From Generation and Characterization to Defense. The taxonomy traces how different LLMs are exploited to generate phishing content with varied techniques (RQ1); how these attacks exhibit distinctive patterns and linguistic features (RQ2); and how tailored detection and defense strategies address LLM-facilitated text-based phishing attacks (RQ3). Dataset availability and evaluation metrics are examined throughout the lifecycle (RQ4).
  • Figure 2: Flow diagram summarizing paper screening along the three research questions.
  • Figure 3: The distribution of screened papers per year, grouped by research question (from 2018 to October 2025). Numbers inside the bars denote the count of studies focused on generation, characterization, and defense. The numbers in front of each bar are the total count of papers in that year.
  • Figure 4: Full-lifecycle taxonomy of LLM-generated phishing, spanning generation techniques, attack characteristics, and corresponding defense strategies. Lines on the right refer to a defense technique that can counter a specific attack or a group of attacks. Lines on the left represent studied characteristics of LLM-generated phishing arising from different generation methods.