Power-of-$d$ Choices Load Balancing in the Sub-Halfin Whitt Regime
Sushil Mahavir Varma, Francisco Castro, Siva Theja Maguluri
TL;DR
This work analyzes Power-of-$d$ choices routing in a many-server load-balancing system under the sub-Halfin-Whitt regime with arrival rate $\lambda = n - n^{1-\gamma}$, $\gamma\in(0,0.5)$. It develops an iterative state-space collapse framework driven by Lyapunov drift to obtain sharp, high-probability bounds on queue lengths across a broad range of $d$-scalings, revealing phase transitions: zero-delay for $d \ge n^{\gamma}\log n$, finite-delay with queue length $m$ when $d = \Theta((n^{\gamma}\log n)^{1/m})$, and infinite delay for polylogarithmic $d$. By linking the stochastic steady-state to the fixed point of a mean-field ODE, the paper characterizes the dominant term $s_i/n \approx (\lambda/n)^{(d^i-1)/(d-1)}$ and provides matching upper and lower bounds up to lower-order corrections. Simulations corroborate the fixed-point predictions and illustrate the phase transitions. The approach sidesteps Stein’s method, using iterative SSC with Lyapunov drift to achieve a comprehensive, rigorous understanding of Power-of-$d$ dynamics in this regime, with implications for selecting $d$ in large-scale load-balancing systems.
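As a rough illustration of the dominant term quoted above, the following sketch evaluates $s_i/n \approx (\lambda/n)^{(d^i-1)/(d-1)}$ for a few values of $d$. The function name `fixed_point_tail`, the parameter `max_len`, and the sample values of $n$ and $\gamma$ are illustrative assumptions, not taken from the paper. The case $d = 1$ is handled by the limit of the exponent, which equals $i$.

```python
import math

def fixed_point_tail(n, gamma, d, max_len=6):
    """Evaluate (lambda/n)**((d**i - 1)/(d - 1)) for i = 1..max_len.

    Hypothetical helper: returns the approximate fraction of servers
    holding at least i jobs, using only the dominant term of the fixed point.
    """
    rho = 1.0 - n ** (-gamma)  # lambda / n with lambda = n - n**(1 - gamma)
    tails = []
    for i in range(1, max_len + 1):
        # (d**i - 1)/(d - 1) = 1 + d + ... + d**(i-1), which equals i when d = 1.
        exponent = (d ** i - 1) / (d - 1) if d > 1 else i
        # Work in log space so huge exponents underflow to 0.0 quietly.
        tails.append(math.exp(exponent * math.log(rho)))
    return tails

# Illustrative parameters (assumed, not from the paper).
n, gamma = 10**6, 0.25
threshold = math.ceil(n ** gamma * math.log(n))  # d = n^gamma * log n
for d in (2, math.ceil(threshold ** 0.5), threshold):
    print(d, [f"{s:.3e}" for s in fixed_point_tail(n, gamma, d)])
```

Comparing the printed rows shows how quickly the tail fractions decay as $d$ grows; the sketch ignores the lower-order corrections that the paper's bounds account for.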
Abstract
We consider a load-balancing system with Poisson arrivals, exponential service times, and homogeneous servers. Upon arrival, a job is routed to one of the servers, where it waits in queue until served. We consider the Power-of-$d$ choices routing algorithm, which sends each job to the shortest of $d$ randomly sampled queues. We study this system in the many-server heavy-traffic regime, where the number of servers goes to infinity while the load simultaneously approaches capacity. In particular, we consider a sequence of systems with $n$ servers and arrival rate $\lambda = n - n^{1-\gamma}$ for some $\gamma \in (0, 0.5)$, known as the sub-Halfin-Whitt regime. It was shown by [Liu Ying (2020)] that under Power-of-$d$ choices routing with $d \geq n^{\gamma}\log n$, the queue lengths behave like those under join-the-shortest-queue (JSQ) routing and queueing delays are asymptotically zero. The focus of this paper is to characterize the behavior when $d$ is below this threshold. We obtain high-probability bounds on the queue lengths for various values of $d$ and large enough $n$. In particular, we show that when $d$ grows polynomially in $n$ but more slowly than in [Liu Ying (2020)], i.e., if $d$ is $\Theta\left((n^{\gamma}\log n)^{1/m}\right)$ for some integer $m>1$, then the asymptotic queue length is $m$ with high probability. Moreover, if $d$ grows polylogarithmically in $n$, i.e., more slowly than any polynomial, but is at least $\Omega\left((\log n)^3\right)$, the queue length grows to infinity asymptotically. We obtain these results using an iterative state-space collapse (SSC) approach. We first establish a weak SSC on the queue lengths. Then we bootstrap from the weak SSC to iteratively narrow the region of collapse. After enough steps, this inductive refinement yields the bounds we seek. We establish this sequence of collapses using Lyapunov drift arguments.
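The routing rule described in the abstract can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' code: service rates are normalized to 1, the $d$ queues are sampled uniformly without replacement, ties go to the first sampled queue, and the continuous-time chain is simulated through uniformization at rate $\lambda + n$. The function name `simulate_power_of_d` and its parameters are hypothetical.

```python
import random

def simulate_power_of_d(n, gamma, d, steps, seed=0):
    """Run `steps` uniformized transitions and return the queue lengths.

    Assumes unit service rates and arrival rate lambda = n - n**(1 - gamma).
    """
    rng = random.Random(seed)
    lam = n - n ** (1.0 - gamma)
    p_arrival = lam / (lam + n)  # uniformization constant is lam + n
    queues = [0] * n
    for _ in range(steps):
        if rng.random() < p_arrival:
            # Arrival: sample d distinct servers, join the shortest sampled queue.
            sampled = rng.sample(range(n), d)
            target = min(sampled, key=queues.__getitem__)
            queues[target] += 1
        else:
            # Potential departure: a uniformly chosen server completes a job
            # if it has one; an idle server leaves the state unchanged.
            server = rng.randrange(n)
            if queues[server] > 0:
                queues[server] -= 1
    return queues

# Illustrative run with small, assumed parameters.
final = simulate_power_of_d(n=200, gamma=0.3, d=5, steps=200_000, seed=1)
print("longest queue:", max(final))
```

Repeating the run for several values of $d$ and inspecting `max(final)` gives a quick empirical view of how the longest queue depends on $d$, although a system this small is far from the asymptotic regime the results address.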
