SpearBot: Leveraging Large Language Models in a Generative-Critique Framework for Spear-Phishing Email Generation

Qinglin Qi, Yun Luo, Yijia Xu, Wenbo Guo, Yong Fang

TL;DR

This work presents SpearBot, an adversarial framework that uses jailbreak prompts and multi-LLM critics to generate highly personalized spear-phishing emails. By combining data-driven personal information, a 10-type phishing strategy taxonomy, and critique-based optimization, SpearBot demonstrates strong deception capabilities that bypass a range of machine-based and PLM detectors, while human evaluators confirm high readability and deception. The study provides extensive evaluations across six public phishing datasets, multiple defenders, and human subjects, revealing gaps in current defenses and the pivotal role of critics in enhancing adversarial quality. The findings underscore significant security risks posed by advanced LLMs and advocate for stronger, multi-faceted defenses and ethical considerations in AI deployment.

Abstract

Large Language Models (LLMs) are increasingly capable, aiding in tasks such as content generation, yet they also pose risks, particularly in generating harmful spear-phishing emails. These emails, crafted to entice clicks on malicious URLs, threaten personal information security. This paper proposes an adversarial framework, SpearBot, which utilizes LLMs to generate spear-phishing emails with various phishing strategies. Through specifically crafted jailbreak prompts, SpearBot circumvents security policies and introduces other LLM instances as critics. When a phishing email is identified by the critic, SpearBot refines the generated email based on the critique feedback until it can no longer be recognized as phishing, thereby enhancing its deceptive quality. To evaluate the effectiveness of SpearBot, we implement various machine-based defenders and assess how well the phishing emails generated could deceive them. Results show these emails often evade detection to a large extent, underscoring their deceptive quality. Additionally, human evaluations of the emails' readability and deception are conducted through questionnaires, confirming their convincing nature and the significant potential harm of the generated phishing emails.

Paper Structure

This paper contains 36 sections, 12 figures, 6 tables, 1 algorithm.

Figures (12)

  • Figure 1: The process of conducting phishing email attacks can be significantly enhanced by employing LLMs, which not only reduce costs but also increase the level of deception.
  • Figure 2: The framework for our spear-phishing email generation, which includes two main procedures: the jailbreak initialization for generating phishing emails, and the critique-based optimization to enhance the deception of the spear-phishing email.
  • Figure 3: GPT-4's refusal to generate phishing emails, since the model has been aligned with human values and equipped with safety filters.
  • Figure 4: The evaluation methods for phishing emails, divided into machine-based evaluation and human evaluation. The former uses three types of defenders: Machine Learning (ML), Pre-trained Language Model (PLM), and Large Language Model (LLM) defenders. The latter measures the readability and deceptiveness of the generated spear-phishing emails.
  • Figure 5: The accuracies of the different strategies of our SpearBot, where the strategies are shown in the same order as in Table \ref{eval_strategy}.
  • ...and 7 more figures