Convergence Rate Analysis of the Join-the-Shortest-Queue System
Yuanzhe Ma, Siva Theja Maguluri
TL;DR
This paper addresses the finite-time convergence of a two-server Join-the-Shortest-Queue (JSQ) system. By directly analyzing the original continuous-time Markov chain and employing coupling-based hitting-time techniques, the authors derive a non-asymptotic total-variation bound of order $O\left(\frac{1}{(1-\rho)^3}\frac{1}{t}\right)$ for $\rho<1$, with explicit constants. A refined bound on the auxiliary constant $K(\rho)$ is provided for $\rho<1/\sqrt{2}$, and a corollary gives a mean-queue-length convergence rate of $O\left(\frac{1}{(1-\sqrt{\rho})^5}\frac{1}{\sqrt{t}}\right)$. These results give transient performance guarantees that do not rely on diffusion approximations, and lay groundwork for extending transient analyses to larger or heterogeneous JSQ systems.
Abstract
The Join-the-Shortest-Queue (JSQ) policy is among the most widely used routing algorithms for load balancing systems and has been extensively studied. Despite its simplicity and optimality, exact characterization of the system remains challenging. Most prior research has focused on analyzing its performance in steady state in certain asymptotic regimes such as the heavy-traffic regime. However, the convergence rate to steady state in these regimes is often slow, calling into question the reliability of analyses based solely on steady-state and heavy-traffic approximations. To address this limitation, we provide a finite-time convergence rate analysis of a JSQ system with two symmetric servers. In sharp contrast to the existing literature, we directly study the original system as opposed to an approximate limiting system such as a diffusion approximation. Our results demonstrate that for such a system, the convergence rate to its steady state, measured in the total variation distance, is $O \left(\frac{1}{(1-\rho)^3} \frac{1}{t} \right)$, where $\rho\in (0,1)$ is the traffic intensity.
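To make the object of study concrete, the following is a minimal simulation sketch of a two-server JSQ continuous-time Markov chain. It is not taken from the paper and rests on several assumptions: Poisson arrivals at total rate `lam`, exponential service at rate `mu` at each server, ties between equal queues broken uniformly at random, and traffic intensity read as `lam / (2 * mu)`. The function name `simulate_jsq` and all parameter names are hypothetical.

```python
import random


def simulate_jsq(lam, mu, t_end, seed=0, start=(0, 0)):
    """Simulate a two-server JSQ queue up to time t_end and return (q1, q2).

    Assumptions (not from the paper): Poisson arrivals at total rate lam > 0,
    exponential service at rate mu per server, uniform random tie-breaking.
    """
    if lam <= 0 or mu <= 0:
        raise ValueError("lam and mu must be positive")
    rng = random.Random(seed)
    q = list(start)
    t = 0.0
    while True:
        # Competing exponential clocks: one arrival stream, one per busy server.
        rates = [lam, mu if q[0] > 0 else 0.0, mu if q[1] > 0 else 0.0]
        total = sum(rates)
        t += rng.expovariate(total)  # time until the next event
        if t > t_end:
            return tuple(q)
        u = rng.random() * total  # pick which event fired
        if u < rates[0]:
            # Arrival: route to the shorter queue, random choice on a tie.
            if q[0] < q[1]:
                q[0] += 1
            elif q[1] < q[0]:
                q[1] += 1
            else:
                q[rng.randrange(2)] += 1
        elif u < rates[0] + rates[1] and q[0] > 0:
            q[0] -= 1  # service completion at server 1
        elif q[1] > 0:
            q[1] -= 1  # service completion at server 2


if __name__ == "__main__":
    # Assumed traffic intensity: 1.6 / (2 * 1.0) = 0.8.
    print(simulate_jsq(lam=1.6, mu=1.0, t_end=100.0, seed=1))
```

Averaging the returned state over many seeds at several values of `t_end` gives an empirical view of how the transient distribution approaches steady state, which is the quantity the stated bound controls.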
