Offensive AI: Enhancing Directory Brute-forcing Attack with the Use of Language Models
Alberto Castagnaro, Mauro Conti, Luca Pajola
TL;DR
This work tackles the inefficiency of traditional directory brute-forcing by following the Offensive AI paradigm: the attack exploits prior knowledge, either through probabilistic trees or through language-model-based path generation. Using a dataset of one million URLs from universities, hospitals, government, and companies sourced from CommonCrawl, the authors compare standard wordlist-based strategies with these two knowledge-infused approaches. The language-model-based attack improves performance by 969% on average over the baselines, while the probabilistic method offers strong gains under limited budgets, showing the value of prior knowledge when generating candidate paths. The offline, ethically constrained testbed and the dataset analyses underscore both the potential of AI-assisted offensive techniques and the need for robust defenses against them in web vulnerability assessment and penetration testing.
Abstract
Web Vulnerability Assessment and Penetration Testing (Web VAPT) is a comprehensive cybersecurity process that uncovers a range of vulnerabilities which, if exploited, could compromise the integrity of web applications. A VAPT commonly includes a Directory brute-forcing Attack, which aims to identify the accessible directories of a target website. Current commercial solutions are inefficient because they rely on wordlist-based brute-forcing, which requires an enormous number of attempts to obtain only a few successes. Offensive AI is a recent paradigm that integrates AI-based technologies in cyber attacks. In this work, we explore whether AI can enhance the directory enumeration process and propose a novel Language Model-based framework. Our experiments, conducted in a testbed consisting of 1 million URLs from different web application domains (universities, hospitals, government, companies), demonstrate the superiority of the LM-based attack, with an average performance increase of 969%.
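To make the comparison between a generic wordlist and a knowledge-infused strategy concrete, the following is a minimal offline sketch, not the authors' implementation. It assumes a fixed budget of guesses and scores a strategy by how many guessed directories exist on the target. The frequency ranking is a simple stand-in for prior knowledge (it is neither the probabilistic tree nor the language model), and all function names, paths, and toy data are hypothetical.

```python
from collections import Counter


def successful_guesses(candidates, known_dirs, budget):
    """Count distinct guesses among the first `budget` candidates that exist on the target."""
    tried = candidates[:budget]
    return len(set(tried) & set(known_dirs))


def frequency_ranked(training_sites):
    """Rank directories by the number of training sites that contain them.

    Ties are broken alphabetically so the ordering is deterministic.
    This is an assumed, simplified prior, not the paper's method.
    """
    counts = Counter(d for site in training_sites for d in set(site))
    return sorted(counts, key=lambda d: (-counts[d], d))


def percent_increase(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical toy data: directories seen on other sites of the same category.
training_sites = [
    ["/admissions", "/library", "/research"],
    ["/admissions", "/research", "/alumni"],
    ["/admissions", "/library"],
]
# Hypothetical target site, evaluated offline (no requests are sent).
target_dirs = {"/admissions", "/research", "/staff"}
generic_wordlist = ["/admin", "/backup", "/admissions", "/cgi-bin", "/research"]

budget = 3
ranked = frequency_ranked(training_sites)
baseline_hits = successful_guesses(generic_wordlist, target_dirs, budget)
informed_hits = successful_guesses(ranked, target_dirs, budget)
print(baseline_hits, informed_hits, percent_increase(baseline_hits, informed_hits))
```

On this toy data the informed ordering finds two directories within the budget against one for the generic wordlist, a 100% increase. By the same formula, an average increase of 969% corresponds to roughly 10.69 times the baseline score.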
