Control of parallel non-observable queues: asymptotic equivalence and optimality of periodic policies
Jonatha Anselmi, Bruno Gaujal, Tommaso Nesti
TL;DR
This work addresses the minimization of the stationary mean waiting time in a system of parallel, non-observable queues with deterministic routing, under a large-scale replication regime. It introduces a class of periodic policies and proves that, as the replication factor $k$ grows, these policies become asymptotically equivalent and optimal. The limiting waiting time is $\mathbb{E} W(p)$, the mean waiting time of a collection of independent $D/GI/1$ queues, and it is convex in the allocation vector $p$. The analysis leverages Loynes' framework and Strassen couplings to establish convergence in distribution and, under mild moment conditions, convergence of means and variances. For large systems, the results suggest that the original hard problem can be approached as a convex optimization over $p$. In practice, this provides a tractable way to design dispatch policies in large-scale systems (e.g., cloud or volunteer computing): select the proportions $p$ that minimize the convex objective $\mathbb{E} W(p)$.
Abstract
We consider a queueing system composed of a dispatcher that deterministically routes jobs to a set of non-observable queues working in parallel. In this setting, the fundamental problem is to determine which policy the dispatcher should implement to minimize the stationary mean waiting time of the incoming jobs. We present a structural property that holds in the classic scaling of the system where the network demand (arrival rate of jobs) grows proportionally with the number of queues. Assuming that each queue of type $r$ is replicated $k$ times, we consider a set of policies that are periodic with period $k \sum_r p_r$ and such that exactly $p_r$ jobs are sent in a period to each queue of type $r$. When $k\to\infty$, our main result shows that all the policies in this set are equivalent, in the sense that they yield the same stationary mean waiting time, and optimal, in the sense that no other policy having the same aggregate arrival rate to all queues of a given type can do better in minimizing the stationary mean waiting time. This property holds in a strong probabilistic sense. Furthermore, the limiting mean waiting time achieved by our policies is a convex function of the arrival rate in each queue, which facilitates the development of a further optimization aimed at solving the fundamental problem above for large systems.
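To make the class of periodic policies concrete, the following minimal Python sketch builds one period of such a routing sequence. It is an illustration only: the function names `periodic_policy`, `route`, and `jobs_per_queue`, the `(type, replica)` labelling of queues, and the particular ordering of jobs within a period are assumptions made here, not constructions taken from the paper. Any ordering that sends exactly $p_r$ jobs per period to each queue of type $r$ belongs to the same set.

```python
from collections import Counter
from itertools import cycle, islice


def periodic_policy(p, k):
    """Build one period of a routing sequence (hypothetical helper).

    p : list of ints, p[r] = jobs per period sent to EACH queue of type r
    k : replication factor, i.e. number of queues of each type

    Returns a list of (type, replica) pairs of length k * sum(p).
    The ordering used here is just one admissible choice.
    """
    period = []
    for r, p_r in enumerate(p):
        for replica in range(k):
            # Each of the k replicas of type r gets exactly p_r jobs.
            period.extend([(r, replica)] * p_r)
    return period


def route(period, n_jobs):
    """Destinations of the first n_jobs jobs when the period repeats forever."""
    return list(islice(cycle(period), n_jobs))


def jobs_per_queue(period):
    """Count how many jobs each queue receives within one period."""
    return Counter(period)


if __name__ == "__main__":
    period = periodic_policy(p=[2, 1], k=3)
    print(len(period))             # period length k * sum(p)
    print(jobs_per_queue(period))  # per-queue job counts in one period
```

With `p = [2, 1]` and `k = 3`, the period contains $k \sum_r p_r = 9$ jobs: each of the three type-0 queues receives two jobs and each of the three type-1 queues receives one.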
