Phishsense-1B: A Technical Perspective on an AI-Powered Phishing Detection Model
SE Blake
TL;DR
Phishsense-1B addresses the need for accurate yet resource-efficient phishing detection by fine-tuning a pre-trained language model, Llama-Guard-3-1B, with Low-Rank Adaptation (LoRA) and a GuardReasoner-inspired methodology. The two-stage approach separates broad reasoning in the base model from phishing-specific pattern recognition in a lightweight LoRA adapter, which supports high recall and practical deployment. Empirical results show near-perfect recall and 0.975 accuracy on a custom dataset, and 0.70 accuracy with 0.90 recall on RealDaten, outperforming unadapted models and BERT-based detectors. The work also discusses deployment implications, including browser-extension integrations, and outlines future directions: explainability, multilingual support, and continuous online adaptation to counter evolving phishing threats.
Abstract
Phishing is a persistent cybersecurity threat. This paper introduces Phishsense-1B, a fine-tuned version of the Llama-Guard-3-1B model tailored for phishing detection and reasoning. The adaptation uses Low-Rank Adaptation (LoRA) and the GuardReasoner fine-tuning methodology. We outline our LoRA-based fine-tuning process, describe the balanced dataset of phishing and benign emails, and report the performance improvements over the original model. Phishsense-1B achieves 97.5% accuracy on a custom dataset and 70% accuracy on a challenging real-world dataset, surpassing both unadapted models and BERT-based detectors. Additionally, we review current state-of-the-art detection methods, compare prompt engineering with fine-tuning strategies, and explore potential deployment scenarios.
