Targeted Phishing Campaigns using Large Scale Language Models
Rabimba Karanjai
TL;DR
The paper investigates the phishing risk posed by large-scale language models by evaluating how well GPT-2, GPT-3, and OPT, trained on a mix of benign and malicious data, generate convincing phishing emails. It presents a methodology that combines data curation, preprocessing, and prompt-based generation, and it analyzes generation quality, evasion of defenses, and attacker success signals using token-level statistics. The key finding is that LLMs can produce highly persuasive, hard-to-detect emails, with performance depending on the model and the training data. The work highlights urgent security implications and the need for robust detection and defense strategies as NLG capabilities mature.
Abstract
We explore the potential of natural language models (NLMs) such as GPT-3 and GPT-2 to generate effective phishing emails. Phishing emails are fraudulent messages designed to trick individuals into revealing sensitive information or taking actions that benefit the attacker. We propose a framework for evaluating how well NLMs generate such emails against several criteria: the quality of the generated text, the ability to bypass spam filters, and the success rate in tricking individuals. Our evaluations show that NLMs can generate phishing emails that are difficult to detect and that have a high success rate in tricking individuals, although their effectiveness varies with the specific NLM and the training data used. These results indicate that NLMs could significantly affect the prevalence of phishing attacks, and they underscore the need for further study of the ethical and security implications of using NLMs for malicious purposes.
