"Are Adversarial Phishing Webpages a Threat in Reality?" Understanding the Users' Perception of Adversarial Webpages
Ying Yuan, Qingying Hao, Giovanni Apruzzese, Mauro Conti, Gang Wang
TL;DR
This work investigates whether adversarial phishing webpages (APWs) that evade ML-powered detectors also trick real users. It conducts two user studies with $n=470$ participants, comparing legitimate pages, unperturbed phishing pages, and both lab-crafted and real-world APWs. Results show that several adversarial perturbations deceive users nearly as effectively as unperturbed phishing, though typos and some perturbations found in the wild are more detectable, and greater familiarity with a brand is associated with lower detection accuracy, likely due to overconfidence. The study provides actionable insights for security practitioners, emphasizes the necessity of user-centric evaluation when assessing evasion attacks on phishing detectors, and offers open resources to benchmark future work.
Abstract
Machine-learning-based phishing website detectors (ML-PWD) are a critical part of today's anti-phishing solutions in operation. Unfortunately, ML-PWD are prone to adversarial evasions, as evidenced by both academic studies and analyses of real-world adversarial phishing webpages. However, existing work has mostly focused on assessing adversarial phishing webpages against ML-PWD, while neglecting a crucial aspect: investigating whether they can deceive the actual target of phishing -- the end users. In this paper, we fill this gap by conducting two user studies (n=470) to examine how human users perceive adversarial phishing webpages, spanning both synthetically crafted ones (which we create by evading a state-of-the-art ML-PWD) and real adversarial webpages (taken from the wild Web) that bypassed a production-grade ML-PWD. Our findings confirm that adversarial phishing is a threat to both users and ML-PWD, since most adversarial phishing webpages are about as effective on users as unperturbed ones. However, not all adversarial perturbations are equally effective. For example, those with added typos are significantly more noticeable to users, who tend to overlook perturbations of higher visual magnitude (such as replacing the background). We also show that users' self-reported frequency of visiting a brand's website has a statistically significant negative correlation with their phishing detection accuracy, which is likely caused by overconfidence. We release our resources.
