Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings
Yuanhe Zhang, Zhenhong Zhou, Wei Zhang, Xinyue Wang, Xiaojun Jia, Yang Liu, Sen Su
TL;DR
The paper addresses the vulnerability of black-box LLM services to Denial-of-Service attacks by introducing AutoDoS, an automated attack framework that constructs a DoS Attack Tree, expands prompts through depth backtracking and breadth expansion, and uses a Length Trojan to bypass defenses, all without access to or modification of model parameters. It combines transferability-driven iterative optimization with an Assist Prompt to achieve cross-model effectiveness. Validation across 11 models and multiple datasets shows large increases in output length and resource consumption, as well as strong cross-model transfer. The results reveal significant latency and memory impact on LLM services, demonstrating the practical feasibility of black-box LLM-DoS and the limitations of current defenses such as input filtering and output monitoring. The work underscores the need for robust, architecture-agnostic defenses against resource-exhaustion attacks, and its failure analysis and ablations can inform defense design.
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks, yet they remain vulnerable to external threats, particularly LLM Denial-of-Service (LLM-DoS) attacks. Specifically, LLM-DoS attacks aim to exhaust computational resources and block services. However, existing studies predominantly focus on white-box attacks, leaving black-box scenarios underexplored. In this paper, we introduce the Auto-Generation for LLM-DoS (AutoDoS) attack, an automated algorithm designed for black-box LLMs. AutoDoS constructs the DoS Attack Tree and expands the node coverage to achieve effectiveness under black-box conditions. Through transferability-driven iterative optimization, AutoDoS can work across different models with a single prompt. Furthermore, we reveal that embedding the Length Trojan allows AutoDoS to bypass existing defenses more effectively. Experimental results show that AutoDoS amplifies service response latency by over 250$\times$, leading to severe resource consumption in terms of GPU utilization and memory usage. Our work provides a new perspective on LLM-DoS attacks and security defenses. Our code is available at https://github.com/shuita2333/AutoDoS.
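To make the latency amplification figure concrete, the sketch below shows one way such a factor could be computed: the ratio of mean response latency under attack prompts to mean latency under benign prompts. The helper name `amplification_factor` and the sample timings are illustrative assumptions, not the paper's measurement code or data.

```python
from statistics import mean


def amplification_factor(benign_latencies, attack_latencies):
    """Return mean attack latency divided by mean benign latency.

    Hypothetical helper for illustration; not taken from the AutoDoS code.
    """
    if not benign_latencies or not attack_latencies:
        raise ValueError("both latency lists must be non-empty")
    baseline = mean(benign_latencies)
    if baseline <= 0:
        raise ValueError("baseline latency must be positive")
    return mean(attack_latencies) / baseline


# Made-up timings in seconds, for illustration only.
benign = [0.5, 0.5]
attacked = [120.0, 140.0]
factor = amplification_factor(benign, attacked)
print(f"latency amplification: {factor:.0f}x")
```

A defender could track the same ratio per request against a rolling benign baseline and flag requests whose latency far exceeds it, although the abstract notes that existing output monitoring can be bypassed.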
