Zero-Waiting Load Balancing with Heterogeneous Servers in Heavy Traffic

Xin Liu, Lei Ying

TL;DR

This is the first work to establish delay performance bounds for a load-balancing system with $N$ fully heterogeneous servers in heavy traffic. The analysis designs a sequence of Lyapunov functions to handle the high-dimensional heterogeneous system without assuming exchangeability or monotonicity.

Abstract

We study the steady-state delay performance of load balancing in large-scale systems with heterogeneous servers in heavy-traffic regimes. The system consists of $N$ servers, each with a local buffer of size $b-1$, serving jobs in first-in-first-out (FIFO) order. Jobs arrive according to a Poisson process with rate $\lambda N$, where $\lambda = 1 - N^{-\alpha}$ for any $\alpha \in (0,1)$. Service times are assumed to be exponentially distributed with fully heterogeneous rates, where the service rate of each server can differ and may scale with the system size $N$. We study a queue-length-aware and service-rate-aware load balancing policy, Join-the-Fastest-Shortest-Queue (JFSQ), and demonstrate that it achieves asymptotically zero waiting time and waiting probability under the heavy-traffic regimes, including both the Sub-Halfin-Whitt ($\alpha \in (0,0.5)$) and Super-Halfin-Whitt ($\alpha \in [0.5,1)$) regimes. The performance bounds on waiting time and waiting probability explicitly capture the convergence rate with respect to the system size $N$ and show the negative effect of server heterogeneity. Our analysis builds on the general framework of Stein's method with iterative state-space peeling, where we design a sequence of Lyapunov functions to analyze the high-dimensional heterogeneous system without assuming exchangeability and monotonicity. Our analysis shows that JFSQ efficiently utilizes servers with higher capacities, and that the steady-state system can be coupled with a single-server queue via Stein's method. To the best of our knowledge, this is the first work to establish delay performance bounds for a load-balancing system with $N$ fully heterogeneous servers in heavy traffic.
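To make the setup in the abstract concrete, the following is a minimal Python sketch of the per-server load $\lambda = 1 - N^{-\alpha}$ and of one JFSQ routing decision. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the rule "shortest queue first, ties broken by the highest service rate" is inferred from the policy's name, and a server is assumed to hold at most $b$ jobs (one in service plus $b-1$ in its buffer).

```python
def heavy_traffic_load(n_servers: int, alpha: float) -> float:
    """Normalized load lambda = 1 - N^(-alpha), as stated in the abstract."""
    return 1.0 - n_servers ** (-alpha)


def jfsq_dispatch(queue_lengths, service_rates, b):
    """Pick the server that receives an arriving job, or None if all are full.

    Assumptions (not confirmed by the abstract):
      * queue_lengths[i] counts the job in service plus the jobs waiting,
        so server i can accept a job only while queue_lengths[i] < b;
      * among the shortest queues, the fastest server wins the tie.
    """
    candidates = [i for i, q in enumerate(queue_lengths) if q < b]
    if not candidates:
        return None  # every server is full; the job is blocked
    # Sort key: shorter queue first, then larger service rate.
    return min(candidates, key=lambda i: (queue_lengths[i], -service_rates[i]))


# Example: servers 1 and 2 are both idle; server 2 is faster, so it is chosen.
chosen = jfsq_dispatch([1, 0, 0, 2], [3.0, 1.0, 2.0, 5.0], b=3)
load = heavy_traffic_load(100, 0.5)  # boundary of the Super-Halfin-Whitt regime
```

With $N = 100$ and $\alpha = 0.5$ the load is $1 - 100^{-0.5} = 0.9$; larger $\alpha$ pushes the load closer to 1 for the same $N$.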

Paper Structure

This paper contains 15 sections, 13 theorems, 124 equations, 4 figures.

Key Result

Theorem 1

Assume the service rates $\{\mu_n\}$ satisfy the paper's two assumptions on service rates (referenced in the source as "assumption: sub" and "assumption: super"). Then, under the JFSQ policy, the steady-state waiting time and waiting probability have the following performance bounds:

Figures (4)

  • Figure 1: JFSQ Load Balancing in Many-Server Systems.
  • Figure 2: Roadmap for achieving asymptotic zero waiting results.
  • Figure 3: Most capable servers are busy under JFSQ.
  • Figure 4: State-space collapse of Sub and Super Halfin-Whitt regimes under JFSQ.

Theorems & Definitions (20)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Remark 1
  • Lemma 5
  • proof
  • ...and 10 more