PagPassGPT: Pattern Guided Password Guessing via Generative Pretrained Transformer
Xingyu Su, Xiaojie Zhu, Yang Li, Yong Li, Chi Chen, Paulo Esteves-Veríssimo
TL;DR
This work addresses the challenge of generating high-quality, pattern-consistent password guesses under large attack budgets. It introduces PagPassGPT, a GPT-2-based generator that conditions password generation on explicit pattern information, and D&C-GEN, a divide-and-conquer algorithm that reduces duplicates by partitioning the generation task into non-overlapping subtasks. Together, they achieve higher hit rates and substantially lower repeat rates than prior deep-learning approaches, with notable gains in hit rate (HR) for pattern-guided guessing and in cross-site generalization. The methods have practical implications for evaluating password strength and understanding attack surfaces, while their limitations in pattern diversity and computational overhead warrant further research.
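To make "explicit pattern information" concrete, the following is a minimal sketch of how a password might be mapped to a structural pattern and paired with it as model input. The `L`/`N`/`S` run-length notation (letters, digits, other symbols) and the `<BOS>`, `<SEP>`, `<EOS>` markers are assumptions chosen for illustration; this section does not specify the notation or tokenization the authors use.

```python
from itertools import groupby


def char_class(ch: str) -> str:
    """Map a character to an assumed class: L (letter), N (digit), S (other)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "N"
    return "S"


def extract_pattern(password: str) -> str:
    """Run-length encode the character classes, e.g. 'abc123' -> 'L3N3'."""
    return "".join(
        f"{cls}{len(list(run))}" for cls, run in groupby(password, key=char_class)
    )


def build_conditioned_text(password: str) -> str:
    """Prefix the password with its pattern.

    The <BOS>/<SEP>/<EOS> markers are hypothetical placeholders, not the
    authors' actual special tokens.
    """
    return f"<BOS>{extract_pattern(password)}<SEP>{password}<EOS>"
```

Under these assumptions, a model trained on such strings could be prompted with `<BOS>L8N1S1<SEP>` to request guesses of the form eight letters, one digit, one symbol.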
Abstract
Despite the surge in deep-learning-based password guessing models, generating high-quality passwords and reducing duplicate passwords remain open challenges. To address them, we present PagPassGPT, a password guessing model built on the Generative Pretrained Transformer (GPT). It performs pattern-guided guessing by incorporating pattern structure information as background knowledge, which significantly increases the hit rate. Furthermore, we propose D&C-GEN, a divide-and-conquer generation algorithm that reduces the repeat rate of generated passwords. The primary password guessing task is recursively divided into non-overlapping subtasks. Each subtask inherits the knowledge of its parent task and predicts the succeeding tokens. Compared with the state-of-the-art model, our scheme correctly guesses 12% more passwords while producing 25% fewer duplicates.
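The divide-and-conquer idea can be illustrated with a toy sketch. It is not the authors' D&C-GEN implementation: the next-token model `toy_next_token_probs`, the `threshold` parameter, the proportional budget split, and the `<eos>` token are all assumptions made for illustration. A guessing budget is split across next tokens in proportion to their probability, each child subtask inherits the parent's prefix, and a subtask is only sampled directly once its budget is small. Because sibling subtasks start from different prefixes, they cannot produce the same password.

```python
import random
from typing import Callable, Dict, List

EOS = "<eos>"  # hypothetical end-of-password token


def toy_next_token_probs(prefix: str) -> Dict[str, float]:
    """Hypothetical stand-in for a trained model: fixed odds, length-4 passwords."""
    if len(prefix) >= 4:
        return {EOS: 1.0}
    return {"a": 0.5, "b": 0.3, "1": 0.2}


def sample_password(
    prefix: str, probs_fn: Callable[[str], Dict[str, float]], rng: random.Random
) -> str:
    """Extend the prefix token by token until the end token is drawn."""
    out = prefix
    while True:
        probs = probs_fn(out)
        tokens = list(probs)
        tok = rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]
        if tok == EOS:
            return out
        out += tok


def divide_and_conquer(
    prefix: str,
    budget: int,
    threshold: int,
    probs_fn: Callable[[str], Dict[str, float]],
    rng: random.Random,
) -> List[str]:
    """Recursively split a guessing budget into prefix-disjoint subtasks."""
    if budget <= 0:
        return []
    if budget <= threshold:
        # Small subtask: sample directly from the inherited prefix.
        return [sample_password(prefix, probs_fn, rng) for _ in range(budget)]
    guesses: List[str] = []
    for tok, p in probs_fn(prefix).items():
        share = int(budget * p)  # budget assigned to this child subtask
        if tok == EOS:
            # A finished password is emitted once, however large its share.
            if share > 0:
                guesses.append(prefix)
            continue
        guesses.extend(
            divide_and_conquer(prefix + tok, share, threshold, probs_fn, rng)
        )
    return guesses
```

Calling `divide_and_conquer("", 1000, 10, toy_next_token_probs, random.Random(0))` returns at most 1000 guesses. In this sketch duplicates can only arise inside a small leaf subtask, never across subtasks, which is the property that lowers the repeat rate.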
