When AI Defeats Password Deception! A Deep Learning Framework to Distinguish Passwords and Honeywords

Jimmy Dani, Brandon McCulloh, Nitesh Saxena

TL;DR

PassFilter is a CNN-based attack that reframes password-honeyword distinction as a binary classification problem and uses an $\epsilon$-flatness metric to rank sweetwords by their likelihood of being the real password. Trained on breached passwords and adversarial honeywords (via diverse honeyword generation techniques, or HGTs, and GAN/LLM methods), PassFilter outperforms random guessing under the same-service, cross-service, and self-trained threat models. Against the HGTs of Dionysiou et al., with 20 sweetwords per account, it finds the password on the first attempt in 6.10% to 52.78% of cases and within ten attempts in 72.87% to 99.00%. The work shows that existing HGTs, from heuristic tweaking to representation-learning and language-model-based approaches, remain vulnerable to data-driven attacks, which underscores the need for stronger honeyword design and additional defenses such as 2FA. In practice, current honeyword frameworks may offer limited protection unless they are complemented by robust generation strategies and operational safeguards. Overall, the paper motivates a reevaluation of honeyword defenses in light of adaptive, DL-based threats.
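
To make the ranking step concrete, the sketch below shows how per-sweetword classifier scores could be turned into a guessing order and compared with uniform random guessing. It is a minimal illustration, not the authors' code: the function names, the toy sweetwords, and the score values are all made up for the example, and the scores stand in for whatever a trained classifier would output. With 20 sweetwords per account, the setting the paper evaluates, uniform guessing succeeds with probability k/20 after k distinct attempts.

```python
from typing import Dict, List, Sequence, Tuple


def random_guess_rate(attempts: int, num_sweetwords: int) -> float:
    """Chance that `attempts` distinct uniform guesses include the real password."""
    return min(attempts, num_sweetwords) / num_sweetwords


def rank_sweetwords(scores: Dict[str, float]) -> List[str]:
    """Order sweetwords from most to least password-like.

    `scores` maps each sweetword to a hypothetical classifier output,
    where a higher value means "more likely the real password".
    """
    return sorted(scores, key=lambda word: scores[word], reverse=True)


def top_k_success(accounts: Sequence[Tuple[Dict[str, float], str]], k: int) -> float:
    """Fraction of accounts whose real password is among the k top-ranked sweetwords."""
    hits = sum(1 for scores, real in accounts if real in rank_sweetwords(scores)[:k])
    return hits / len(accounts)


# Toy data (invented for illustration): (scores per sweetword, real password).
accounts = [
    ({"summer2019": 0.91, "summer2091": 0.40, "sumner2019": 0.22}, "summer2019"),
    ({"dragon!7": 0.35, "dragon!1": 0.62, "drag0n!7": 0.10}, "dragon!7"),
]

if __name__ == "__main__":
    print(random_guess_rate(1, 20))    # 0.05, the 5% baseline
    print(random_guess_rate(5, 20))    # 0.25
    print(top_k_success(accounts, 1))  # 0.5 on the toy data
    print(top_k_success(accounts, 2))  # 1.0 on the toy data
```

Plotting `top_k_success` against k gives a success-versus-attempts curve; an HGT that resisted the classifier would keep that curve close to the `random_guess_rate` line.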

Abstract

"Honeywords" have emerged as a promising defense mechanism for detecting data breaches and foiling offline dictionary attacks (ODA) by deceiving attackers with false passwords. In this paper, we propose PassFilter, a novel deep learning (DL) based attack framework that identifies passwords from a set of sweetwords associated with a user account, effectively challenging a variety of honeyword generation techniques (HGTs). The DL model in PassFilter is trained with a set of previously collected or adversarially generated passwords and honeywords, and carefully orchestrated to predict whether a sweetword is the password or a honeyword. Our model can compromise the security of state-of-the-art, heuristics-based, and representation learning-based HGTs proposed by Dionysiou et al. Specifically, our analysis with nine publicly available password datasets shows that PassFilter significantly outperforms the baseline random guessing success rate of 5%, achieving 6.10% to 52.78% on the 1st guessing attempt, considering 20 sweetwords per account. This success rate rapidly increases with additional login attempts before account lock-out, which many real-world online services allow in order to maintain reasonable usability. For example, it ranges from 41.78% to 96.80% for five attempts, and from 72.87% to 99.00% for ten attempts, compared to 25% and 50% for random guessing, respectively. We also examined PassFilter against general-purpose language models used for honeyword generation, like those proposed by Yu et al. These honeywords also proved vulnerable to our attack, with success rates of 14.19% on the 1st guessing attempt, increasing to 30.23%, 41.70%, and 63.10% after the 3rd, 5th, and 10th guessing attempts, respectively. Our findings demonstrate the effectiveness of the DL model deployed in PassFilter in breaching state-of-the-art HGTs and compromising password security based on ODA.

Paper Structure (15 sections, 11 figures, 9 tables)

Figures (11)

  • Figure 1: Same-service threat model - $\mathcal{A}$ compromises $\mathcal{S}$ and $\mathcal{H}$ at time $\mathcal{T}_{1}$ and acquires a labeled dataset. At $\mathcal{T}_{2}$, the attacker compromises only $\mathcal{S}$ to obtain the list of sweetwords associated with $\mathcal{U}$'s account(s).
  • Figure 2: Cross-service threat model - $\mathcal{A}$ breaches web services $\mathcal{S}_{A}$ and $\mathcal{S}_{B}$, using credentials from $\mathcal{S}_{A}$ to train a CNN classifier for identifying $\mathcal{U}$'s passwords in $\mathcal{S}_{B}$.
  • Figure 3: Self-trained threat model - $\mathcal{A}$ creates a diverse set of passwords using the adversarial model (PassGAN HB19) and generates corresponding honeywords to train a DL classifier.
  • Figure 4: The design of PassFilter to identify passwords from a list of sweetwords associated with a user’s account. In Phase I, a dataset of honeywords corresponding to the passwords is created. In Phase II, both honeywords and passwords are tokenized to train and evaluate a CNN model.
  • Figure 5: The architecture of the CNN model used for PassFilter.
  • ...and 6 more figures
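
Phase II in Figure 4 tokenizes passwords and honeywords before they reach the CNN. The captions do not say how, so the sketch below assumes a simple character-level encoding with fixed-length padding. The vocabulary, the `max_len` of 16, the reserved ids, and the function names are illustrative assumptions rather than details from the paper.

```python
import string
from typing import Dict, List, Tuple

# Assumed conventions (not from the paper): id 0 pads, id 1 marks unknown characters.
PAD, UNK = 0, 1
ALPHABET = string.digits + string.ascii_letters + string.punctuation + " "
VOCAB: Dict[str, int] = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode(sweetword: str, max_len: int = 16) -> List[int]:
    """Map a sweetword to a fixed-length list of character ids."""
    ids = [VOCAB.get(ch, UNK) for ch in sweetword[:max_len]]  # truncate long inputs
    return ids + [PAD] * (max_len - len(ids))                 # right-pad short ones


def label_account(
    password: str, honeywords: List[str], max_len: int = 16
) -> List[Tuple[List[int], int]]:
    """Build binary-classification rows: label 1 for the password, 0 for honeywords."""
    rows = [(encode(password, max_len), 1)]
    rows += [(encode(word, max_len), 0) for word in honeywords]
    return rows


if __name__ == "__main__":
    # Toy sweetwords invented for illustration.
    for ids, label in label_account("summer2019", ["summer2091", "sumner2019"]):
        print(label, ids)
```

Rows like these, one positive and several negatives per account, are the kind of labeled input a binary classifier would train on; a class-imbalance strategy would also be needed, which this sketch leaves out.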