Phishsense-1B: A Technical Perspective on an AI-Powered Phishing Detection Model

SE Blake

TL;DR

Phishsense-1B targets accurate yet resource-efficient phishing detection by fine-tuning a pre-trained language model with Low-Rank Adaptation (LoRA) on top of a GuardReasoner-inspired base. The two-stage approach separates broad reasoning, handled by the base model, from phishing-specific pattern recognition, handled by a lightweight LoRA adapter, which supports high recall and practical deployment. Empirical results show near-perfect recall and 0.975 accuracy on a custom dataset, and 0.70 accuracy with 0.90 recall on RealDaten, outperforming unadapted models and BERT-based detectors. The work also discusses deployment implications, including browser-extension integrations, and outlines future directions: explainability, multilingual support, and continuous online adaptation to counter evolving phishing threats.

Abstract

Phishing is a persistent cybersecurity threat. This paper introduces Phishsense-1B, a version of the Llama-Guard-3-1B model fine-tuned for phishing detection and reasoning using Low-Rank Adaptation (LoRA) and the GuardReasoner fine-tuning methodology. We outline our LoRA-based fine-tuning process, describe the balanced dataset of phishing and benign emails, and report the performance improvements over the original model. Phishsense-1B achieves 97.5% accuracy on a custom dataset and 70% accuracy on a challenging real-world dataset, surpassing both unadapted models and BERT-based detectors. We also examine current state-of-the-art detection methods, compare prompt engineering with fine-tuning strategies, and explore potential deployment scenarios.

Paper Structure

This paper contains 16 sections, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: High-level schematic of the training workflow. The Llama-3.2-1B model is fine-tuned for enhanced reasoning to form the Phishsense-1B base model. Separately, LoRA-based fine-tuning on Llama-Guard-3-1B yields a phishing-focused adapter.
  • Figure 2: Inference workflow. The Phishsense-1B base model is augmented with the LoRA adapter. Only a small fraction of the parameters is updated during training; at inference time, the base model and the adapter together yield the final phishing verdict.
  • Figure 3: ROC plots comparing the base model with the adapter-inclusive model on an adversarial dataset.
  • Figure 4: ROC plots comparing the base model with other comparable models on the custom dataset. Note that Llama-3.2-1B is not included in the other figures.
  • Figure 5: ROC plots comparing the base model with other comparable models on the RealDaten dataset.
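
The captions for Figures 1 and 2 describe a frozen base model combined with a small LoRA adapter. The following is a minimal, dependency-free sketch of that general idea for a single linear layer, not the authors' implementation. The function names, the `down`/`up` naming for the two low-rank factors, and the layer size and rank used in the usage note are illustrative assumptions and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_forward(x, w_frozen, down, up, alpha):
    """Apply a frozen weight plus a scaled low-rank update.

    Shapes (row-vector convention, all hypothetical):
      x: (n, d_in), w_frozen: (d_in, d_out),
      down: (d_in, r), up: (r, d_out).
    Computes x @ w_frozen + (alpha / r) * (x @ down @ up)
    without ever building the full (d_in, d_out) update matrix.
    """
    rank = len(up)
    scale = alpha / rank
    base = matmul(x, w_frozen)             # frozen path, never trained
    delta = matmul(matmul(x, down), up)    # adapter path, the only trained part
    return [[b + scale * d for b, d in zip(b_row, d_row)]
            for b_row, d_row in zip(base, delta)]


def trainable_fraction(d_in, d_out, rank):
    """Adapter parameters as a share of frozen-plus-adapter parameters."""
    adapter = rank * (d_in + d_out)
    return adapter / (d_in * d_out + adapter)


if __name__ == "__main__":
    x = [[1.0, 2.0]]
    w = [[1.0, 0.0], [0.0, 1.0]]
    down = [[1.0], [1.0]]

    # A zero-initialised `up` factor leaves the frozen output unchanged.
    print(lora_forward(x, w, down, [[0.0, 0.0]], alpha=2.0))
    # A non-zero `up` factor shifts the output through the low-rank path.
    print(lora_forward(x, w, down, [[1.0, 2.0]], alpha=2.0))
    # Hypothetical 2048 x 2048 layer with rank 8 (sizes chosen for illustration).
    print(trainable_fraction(2048, 2048, 8))
```

For the hypothetical 2048 x 2048 layer with rank 8, the adapter holds 32,768 parameters against 4,194,304 frozen ones, so under these assumed sizes it accounts for 1/129 (under 1%) of the combined total. This is the sense in which only a small fraction of parameters is updated during training, while both paths contribute at inference time.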