$PD^3F$: A Pluggable and Dynamic DoS-Defense Framework Against Resource Consumption Attacks Targeting Large Language Models

Yuanhe Zhang, Xinyue Wang, Haoran Gao, Zhenhong Zhou, Fanyu Meng, Yuyao Zhang, Sen Su

TL;DR

The paper proposes the Pluggable and Dynamic DoS-Defense Framework ($PD^3F$), a two-stage defense against resource consumption attacks that acts on both the input and output sides. Its Adaptive End-Based Suppression mechanism terminates excessive malicious generation early.

Abstract

Large Language Models (LLMs), due to substantial computational requirements, are vulnerable to resource consumption attacks, which can severely degrade server performance or even cause crashes, as demonstrated by denial-of-service (DoS) attacks designed for LLMs. However, existing works lack mitigation strategies against such threats, resulting in unresolved security risks for real-world LLM deployments. To this end, we propose the Pluggable and Dynamic DoS-Defense Framework ($PD^3F$), which employs a two-stage approach to defend against resource consumption attacks from both the input and output sides. On the input side, we propose the Resource Index to guide Dynamic Request Polling Scheduling, thereby reducing resource usage induced by malicious attacks under high-concurrency scenarios. On the output side, we introduce the Adaptive End-Based Suppression mechanism, which terminates excessive malicious generation early. Experiments across six models demonstrate that $PD^3F$ significantly mitigates resource consumption attacks, improving users' access capacity by up to 500% during adversarial load. $PD^3F$ represents a step toward the resilient and resource-aware deployment of LLMs against resource consumption attacks.
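To make the two-stage idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the Resource Index, the polling rule, or the suppression rule, so every formula here is an assumption: `resource_index` is a hypothetical score (estimated total tokens), `poll_schedule` is a hypothetical weighted polling over a light and a heavy queue, and `eos_bias` is a hypothetical additive bias on the end-of-sequence logit that grows once generation exceeds a soft limit.

```python
from collections import deque
from typing import List, Tuple


def resource_index(prompt_tokens: int, expected_output_tokens: int) -> float:
    """Hypothetical score (assumption): estimated total tokens for a request."""
    return float(prompt_tokens + expected_output_tokens)


def poll_schedule(
    requests: List[Tuple[str, float]],
    threshold: float,
    light_per_heavy: int = 2,
) -> List[str]:
    """Input side (assumed rule): split requests by score, then poll the queues.

    Up to `light_per_heavy` low-score requests are served for every
    high-score request, so costly requests cannot occupy the whole queue.
    """
    light = deque(rid for rid, score in requests if score <= threshold)
    heavy = deque(rid for rid, score in requests if score > threshold)
    order: List[str] = []
    while light or heavy:
        for _ in range(light_per_heavy):
            if light:
                order.append(light.popleft())
        if heavy:
            order.append(heavy.popleft())
    return order


def eos_bias(generated_tokens: int, soft_limit: int, slope: float = 0.05) -> float:
    """Output side (assumed rule): bias added to the end-of-sequence logit.

    Zero up to `soft_limit`, then increasing linearly, which makes
    termination progressively more likely for very long generations.
    """
    return max(0, generated_tokens - soft_limit) * slope


if __name__ == "__main__":
    reqs = [
        ("a", resource_index(10, 0)),
        ("b", resource_index(100, 800)),
        ("c", resource_index(20, 0)),
    ]
    print(poll_schedule(reqs, threshold=100.0))
    print(eos_bias(300, soft_limit=200))
```

In this sketch the threshold, the polling ratio, and the slope are free parameters chosen for illustration; the paper's actual scheduling and suppression rules may differ substantially.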

Paper Structure

This paper contains 59 sections, 19 equations, 33 figures, 11 tables.

Figures (33)

  • Figure 1: The defense effect of $PD^3F$ against resource consumption attacks.
  • Figure 2: The $PD^3F$ mitigation pipeline for resource consumption attacks consists of three stages: (1) request clustering based on a computed Resource Index; (2) dynamic scheduling and reordering of request queues; and (3) elastic output-length suppression to limit resource usage induced by adversarial prompts.
  • Figure 3: Difference between benign and attack requests under the Resource Index on the Llama70B model.
  • Figure 4: The improvement in benign user throughput (BUT) under $PD^3F$ indicates stronger resistance to attacks, while the reduction in total tokens (TT) reflects decreased overall resource consumption.
  • Figure 5: Changes in BUT for $PD^3F$ under varying numbers of requests and users. The main experimental parameters were selected to ensure result stability.
  • ...and 28 more figures