Rethinking HTTP API Rate Limiting: A Client-Side Approach

Behrooz Farkiani, Fan Liu, Patrick Crowley

TL;DR

This work tackles HTTP API rate limiting when independent clients share a common quota, where server-only controls can waste capacity because clients have no visibility into one another's load. It introduces two decentralized client-side algorithms, ATB and AATB, that infer system congestion and adapt retry timing; ATB is deployable via a service worker, and AATB additionally uses aggregated telemetry data. The authors provide a MILP formulation of the rate-limiting problem and demonstrate on real and synthetic traces that their methods reduce HTTP 429 errors by up to 97.3%, with modest increases in completion time, outperforming standard exponential backoff and window-based backoff baselines. The findings suggest practical, low-overhead means to improve service delivery under shared quotas without server cooperation or central coordination, especially under heavy load.

Abstract

HTTP underpins modern Internet services, and providers enforce quotas to regulate HTTP API traffic for scalability and reliability. When requests exceed quotas, clients are throttled and must retry. Server-side enforcement protects the service. However, when independent clients' usage counts toward a shared quota, server-only controls are inefficient: clients lack visibility into others' load, so their retry attempts may fail. Retry timing matters because each attempt incurs costs and yields no benefit unless admitted. While centralized coordination could address this, practical limitations have led to widespread adoption of simple client-side strategies like exponential backoff. As we show, these simple strategies cause excessive retries and significant costs. We design adaptive client-side mechanisms requiring no central control, relying only on minimal feedback. We present two algorithms: ATB, an offline method deployable via service workers, and AATB, which enhances retry behavior using aggregated telemetry data. Both algorithms infer system congestion to schedule retries. Through emulations with real-world traces and synthetic datasets with up to 100 clients, we demonstrate that our algorithms reduce HTTP 429 errors by up to 97.3% compared to exponential backoff, while the modest increase in completion time is outweighed by the reduction in errors.
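
For readers unfamiliar with the baseline, the following is a minimal sketch of client-side exponential backoff on HTTP 429 responses. It is a generic illustration of the simple strategy the abstract compares against, not the authors' ATB or AATB algorithms, and the exact baseline configuration used in the paper is not given here. The names `backoff_delay` and `send_with_backoff`, the `send` callable, the parameters `base_delay`, `max_delay`, and `max_retries`, and the use of full jitter are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 64.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay before the retry that follows failed attempt `attempt` (0-based).

    Assumption: "full jitter" variant, i.e. a uniform draw from
    [0, min(max_delay, base_delay * 2**attempt)].
    """
    cap = min(max_delay, base_delay * (2 ** attempt))
    source = rng if rng is not None else random
    return source.uniform(0.0, cap)


def send_with_backoff(send: Callable[[], int], max_retries: int = 5,
                      base_delay: float = 1.0, max_delay: float = 64.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, int]:
    """Call `send` (hypothetical: returns an HTTP status code) until it is
    admitted or retries run out. Returns (final_status, number_of_429s)."""
    throttled = 0
    for attempt in range(max_retries + 1):
        status = send()
        if status != 429:
            return status, throttled
        throttled += 1  # each rejected attempt costs a round trip and gains nothing
        if attempt < max_retries:
            sleep(backoff_delay(attempt, base_delay, max_delay))
    return 429, throttled
```

Note that the delay depends only on the client's own attempt count. The client learns nothing about how loaded the shared quota is, which is the blind spot the abstract describes: under a shared quota, a retry scheduled this way can land while other clients are still consuming the budget and be rejected again.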

Paper Structure

This paper contains 7 sections, 1 figure, 6 tables.

Figures (1)

  • Figure 1: Evaluation results: (a-c) Real-world trace, (d-f) Five-client scenario, (g-i) One-hundred-client scenario. Standard deviation is shown as error bars.