PassTSL: Modeling Human-Created Passwords through Two-Stage Learning

Yangde Wang, Haozhang Li, Weidong Qiu, Shujun Li, Peng Tang

TL;DR

PassTSL is a transformer-based two-stage learning framework for modeling human-created passwords. It combines pretraining on a large, diverse password corpus with finetuning on target datasets, so that the model captures both universal and dataset-specific password distributions. The model is trained to maximize the likelihood $L(\boldsymbol{x})$ of character sequences (see expression in text), which supports both password guessing and a real-time password strength meter (PSM). Across six large leaked databases, PassTSL outperforms five SOTA methods by $4.11\%$ to $64.69\%$ at the maximum point, and finetuning with as little as $0.1\%$ additional data yields notable gains. A lightweight, compressed PassTSL-based PSM shows a more favorable trade-off between safe and unsafe errors than FLA and zxcvbn, which suggests it is practical for use in browsers. The work also introduces a Jensen-Shannon divergence-guided strategy for selecting finetuning data, a data-efficient path to better target-specific password modeling and cracking.
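
The likelihood mentioned above is not spelled out in this summary. As a sketch only, assuming the standard autoregressive factorization used by character-level language models (the symbols $n$ for password length and $\theta$ for model parameters are introduced here for illustration and may differ from the paper's notation), the log-likelihood of a password $\boldsymbol{x} = (x_1, \ldots, x_n)$ can be written as:

```latex
\log P(\boldsymbol{x}; \theta) = \sum_{i=1}^{n} \log P\left(x_i \mid x_1, \ldots, x_{i-1}; \theta\right)
```

Under this form, a password's probability is the product of its per-character conditional probabilities, which is what lets the same model both generate guesses and score a candidate password.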

Abstract

Textual passwords are still the most widely used user authentication mechanism. Due to the close connections between textual passwords and natural languages, advanced technologies in natural language processing (NLP) and machine learning (ML) could be used to model passwords for different purposes, such as studying human password-creation behaviors and developing more advanced password cracking methods for informing better defence mechanisms. In this paper, we propose PassTSL (modeling human-created Passwords through Two-Stage Learning), inspired by the popular pretraining-finetuning framework in NLP and deep learning (DL). We report how different pretraining settings affected PassTSL and demonstrate its effectiveness by applying it to six large leaked password databases. Experimental results showed that it outperforms five state-of-the-art (SOTA) password cracking methods on password guessing by a significant margin ranging from 4.11% to 64.69% at the maximum point. Based on PassTSL, we also implemented a password strength meter (PSM), and our experiments showed that it was able to estimate password strength more accurately, causing fewer unsafe errors (overestimating the password strength) than two other SOTA PSMs, a neural-network based method and zxcvbn, when they produce the same rate of safe errors (underestimating the password strength). Furthermore, we explored multiple finetuning settings, and our evaluations showed that even a small amount of additional training data, e.g., only 0.1% of the pretraining data, can lead to over 3% improvement in password guessing on average. We also proposed a heuristic approach to selecting finetuning passwords based on JS (Jensen-Shannon) divergence, and experimental results validated its usefulness. In summary, our contributions demonstrate the potential and feasibility of applying advanced NLP and ML methods to password modeling and cracking.
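
To make the JS-divergence heuristic concrete, here is a minimal sketch of ranking candidate finetuning datasets by their divergence from a sample of the target. The abstract does not say which features the divergence is computed over, so the use of character frequencies is an assumption, and the function names (`char_distribution`, `js_divergence`, `rank_candidates`) are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def char_distribution(passwords):
    """Relative frequency of each character across a list of passwords.

    Assumption: character frequencies stand in for whatever features
    the paper actually compares.
    """
    counts = Counter(ch for pw in passwords for ch in pw)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    support = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in support}

    def kl_to_mixture(a):
        # KL(A || M); terms with a[k] == 0 contribute nothing
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def rank_candidates(target_sample, candidates):
    """Sort candidate datasets by JS divergence to the target, closest first."""
    target = char_distribution(target_sample)
    scored = [
        (name, js_divergence(target, char_distribution(pws)))
        for name, pws in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1])
```

A caller would pass a small sample from the target database and a dictionary of candidate password lists, then finetune on the candidate with the lowest divergence.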

Paper Structure

This paper contains 19 sections, 4 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: The overall architecture of PassTSL. The redder a cell's color is, the more likely it is to be the next character. Each module is further explained in the section on the pretraining method.
  • Figure 2: Various pretraining settings for PassTSL attacking Chinese databases: $\text{PassTSL}_{\text{Small}}^{\text{CSDN\_1M}}$, $\text{PassTSL}_{\text{Base}}^{\text{CSDN\_1M}}$, $\text{PassTSL}_{\text{Small}}^{\text{COMB\_1M}}$, $\text{PassTSL}_{\text{Small}}^{\text{COMB\_100M}}$. The x-axes represent the number of guesses in the log scale. The y-axes show the percentage of correctly guessed passwords.
  • Figure 3: Various pretraining settings for PassTSL attacking English databases: $\text{PassTSL}_{\text{Small}}^{\text{Gmail\_1M}}$, $\text{PassTSL}_{\text{Base}}^{\text{Gmail\_1M}}$, $\text{PassTSL}_{\text{Small}}^{\text{COMB\_1M}}$, $\text{PassTSL}_{\text{Small}}^{\text{COMB\_100M}}$.
  • Figure 4: Guessing performance of PassTSL against other SOTA models: PassTSL ($\text{PassTSL}_{\text{Small}}^{\text{COMB\_100M}}$), FLA, 6-gram, Backoff, PCFG$_{\text{Se}}$, PCFG$_{\text{Ori}}$. The x-axes represent guess numbers in the log scale. We show the performance over the range of guess numbers commonly used by other researchers (PassGAN, Backoff, PassGPT, FLA, PCFG$_{\text{Se}}$).
  • Figure 5: Guessing numbers of the three tested PSMs compared against MinGuess. Red cells are important because they indicate how many passwords are overestimated by a PSM (denoted as unsafe errors). Underestimations of strength are shown in blue cells (called safe errors) and accurate estimations are shown in green cells. The chromatic intensity rises with the number of passwords.
  • ...and 3 more figures
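
The safe/unsafe error terminology used for the PSM comparison in Figure 5 can be illustrated with a short sketch. It assumes that guess numbers are positive integers and that a PSM estimate and the MinGuess reference are compared by order of magnitude (one cell per power of ten). The caption does not state the actual bin edges, so that binning, along with the names `order_of_magnitude`, `classify_estimate`, and `tally_errors`, is hypothetical.

```python
from collections import Counter


def order_of_magnitude(guess_number):
    """Power-of-ten bin of a positive integer guess number (e.g. 4500 -> 3)."""
    if guess_number < 1:
        raise ValueError("guess numbers must be positive integers")
    return len(str(int(guess_number))) - 1


def classify_estimate(psm_guesses, min_guess):
    """Label one PSM estimate relative to the MinGuess reference.

    'unsafe'   : the PSM rates the password stronger than MinGuess does
    'safe'     : the PSM rates the password weaker than MinGuess does
    'accurate' : both fall in the same power-of-ten bin (assumed binning)
    """
    psm_bin = order_of_magnitude(psm_guesses)
    ref_bin = order_of_magnitude(min_guess)
    if psm_bin > ref_bin:
        return "unsafe"
    if psm_bin < ref_bin:
        return "safe"
    return "accurate"


def tally_errors(pairs):
    """Count labels over (psm_guesses, min_guess) pairs."""
    return Counter(classify_estimate(psm, ref) for psm, ref in pairs)
```

Comparing two meters at the same safe-error rate, as the abstract describes, would then amount to tuning each meter until the `safe` counts match and comparing the `unsafe` counts.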