Password Strength Analysis Through Social Network Data Exposure: A Combined Approach Relying on Data Reconstruction and Generative Models
Maurizio Atzori, Eleonora Calò, Loredana Caruccio, Stefano Cirillo, Giuseppe Polese, Giandomenico Solimando
TL;DR
This work addresses how public social-network data exposure affects password strength and privacy. It proposes SODA ADVANCE, a data-reconstruction tool that integrates with LLMs to generate and evaluate passwords, using the Cumulative Password Strength ($cps$) metric defined over $[0,1]$. Through three pipelines, it demonstrates that LLMs can produce strong, user-data-informed passwords and that data reconstruction improves evaluation quality, with Claude, Gemini, and ChatGPT leading in performance. The study highlights practical implications for password security and privacy, showing both the potential of LLM-assisted evaluation and the need for strong privacy controls and ethical guidelines when leveraging personal data.
Abstract
Although passwords remain the primary defense against unauthorized access, users often choose passwords that are easy to remember. This behavior significantly increases security risks, which are compounded by the inadequacy of many traditional password strength evaluation methods. In this discussion paper, we present SODA ADVANCE, a data reconstruction tool that is also designed to enhance password strength evaluation. In particular, SODA ADVANCE integrates a specialized module that evaluates password strength by leveraging publicly available data from multiple sources, including social media platforms. Moreover, we investigate the capabilities of emerging Large Language Models (LLMs) in evaluating passwords and the risks they pose in generating them. Experimental assessments conducted with 100 real users demonstrate that LLMs can generate strong, personalized passwords, which can be tailored to user profiles. Additionally, LLMs proved effective at evaluating passwords, especially when they can take user profile data into account.
