GenXSS: an AI-Driven Framework for Automated Detection of XSS Attacks in WAFs
Vahid Babaey, Arun Ravindran
TL;DR
GenXSS tackles the challenge of obfuscated XSS attacks by using in-context learning with large language models to generate diverse XSS payloads, validating them in a vulnerable application, and testing their ability to bypass WAFs. It then clusters bypassing payloads and leverages a second LLM to create tailored WAF rules, refined through reinforcement learning with human feedback. Experiments with GPT-4o and Gemini Pro show high payload validity and substantial bypass potential against ModSecurity and AWS WAF, while the generated rules block a large fraction of bypasses, demonstrating a practical, automated defense loop. The framework offers a scalable approach to strengthening WAF defenses against evolving XSS techniques, with future directions including multi-agent prompting, ethical safeguards, and improved anomaly detection.
Abstract
The increasing reliance on web services has led to a rise in cybersecurity threats, particularly Cross-Site Scripting (XSS) attacks, which target the client-side layers of web applications by injecting malicious scripts. Traditional Web Application Firewalls (WAFs) struggle to detect highly obfuscated and complex attacks because their rules require manual updates. This paper presents a novel generative AI framework that leverages Large Language Models (LLMs) to enhance XSS mitigation. The framework has two primary objectives: (1) generating sophisticated and syntactically validated XSS payloads using in-context learning, and (2) automating defense mechanisms by testing these attacks against a vulnerable application secured by a WAF, classifying bypassing attacks, and generating effective WAF security rules. Experimental results using GPT-4o demonstrate the framework's effectiveness: it generated 264 XSS payloads, 83% of which were validated, with 80% bypassing a ModSecurity WAF equipped with an industry-standard security rule set developed by the Open Web Application Security Project (OWASP) to protect against web vulnerabilities. Through rule generation, 86% of previously successful attacks were blocked using only 15 new rules. In comparison, Google Gemini Pro achieved a lower bypass rate of 63%, highlighting performance differences across LLMs.
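The generate, validate, test, and re-test loop described in the abstract can be sketched as a simple bookkeeping pipeline. This is a minimal illustration, not the authors' implementation: the names `LoopReport`, `run_defense_loop`, `is_valid`, `waf_blocks`, and `new_rules_block` are hypothetical, the three checks are stand-in callables rather than real calls to an LLM, a vulnerable application, or a WAF, and the denominators used for each rate are assumptions, since the abstract does not state them.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class LoopReport:
    """Counts collected at each stage of the (hypothetical) loop."""
    generated: int
    validated: int
    bypassed: int
    blocked_after_rules: int

    @staticmethod
    def _ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    @property
    def validity_rate(self) -> float:
        # Assumption: validated payloads over all generated payloads.
        return self._ratio(self.validated, self.generated)

    @property
    def bypass_rate(self) -> float:
        # Assumption: bypassing payloads over validated payloads.
        return self._ratio(self.bypassed, self.validated)

    @property
    def block_rate(self) -> float:
        # Assumption: bypasses stopped by new rules over all bypasses.
        return self._ratio(self.blocked_after_rules, self.bypassed)


def run_defense_loop(
    payloads: Iterable[str],
    is_valid: Callable[[str], bool],         # stand-in for validation in a vulnerable app
    waf_blocks: Callable[[str], bool],       # stand-in for the baseline WAF rule set
    new_rules_block: Callable[[str], bool],  # stand-in for the generated rules
) -> LoopReport:
    payloads = list(payloads)
    valid = [p for p in payloads if is_valid(p)]
    bypassing = [p for p in valid if not waf_blocks(p)]
    blocked = [p for p in bypassing if new_rules_block(p)]
    return LoopReport(len(payloads), len(valid), len(bypassing), len(blocked))
```

With real components plugged in for the three callables, the same report would yield the kind of validity, bypass, and block percentages quoted above, under whichever denominators the study actually used.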
