An Upper Bound on the M/M/k Queue With Deterministic Setup Times

Jalani Williams, Weina Wang, Mor Harchol-Balter

TL;DR

This work analyzes a multiserver queue with Deterministic server setup times (M/M/k/Setup-Deterministic), deriving the first closed-form, multiplicatively tight upper and lower bounds on the average queue length and providing a simple, accurate approximation. The authors introduce the Method of Intervening Stopping Times (MIST) to bound random time integrals by partitioning time into strategically chosen stopping intervals, enabling rigorous martingale-based analysis in a high-dimensional state space. The results demonstrate that large relative setup times dramatically increase waiting compared to models with Exponential setups and provide practical insights for capacity provisioning; the approximation aligns closely with simulation across many parameter regimes and remains informative near critical load. The work also discusses mitigation strategies (policy-related and design implications) and outlines directions for extending the approach to other setup-time distributions and tail performance metrics.

Abstract

In many systems, servers do not turn on instantly; instead, a setup time must pass before a server can begin work. These "setup times" can wreak havoc on a system's queueing; this is especially true in modern systems, where servers are regularly turned on and off as a way to reduce operating costs (energy, labor, CO2, etc.). To design modern systems which are both efficient and performant, we need to understand how setup times affect queues. Unfortunately, despite successes in understanding setup in a single-server system, setup in a multiserver system remains poorly understood. To circumvent the main difficulty in analyzing multiserver setup, all existing results assume that setup times are memoryless, i.e. distributed Exponentially. However, in most practical settings, setup times are close to Deterministic, and the widely used Exponential-setup assumption leads to unrealistic model behavior and a dramatic underestimation of the true harm caused by setup times. This paper provides a comprehensive characterization of the average waiting time in a multiserver system with Deterministic setup times, the M/M/k/Setup-Deterministic. In particular, we derive upper and lower bounds on the average waiting time in this system, and show these bounds are within a multiplicative constant of each other. These bounds are the first closed-form characterization of waiting time in any finite-server system with setup times. Further, we demonstrate how to combine our upper and lower bounds to derive a simple and accurate approximation for the average waiting time. These results are all made possible via a new technique for analyzing random time integrals that we named the Method of Intervening Stopping Times, or MIST.

Paper Structure

This paper contains 248 sections, 13 theorems, 333 equations, 9 figures.

Key Result

Theorem 1

For an M/M/k/Setup-Deterministic with an offered load $R \triangleq k \rho \geq 100$ and a setup time $\beta \geq 100 \frac{1}{\mu}$, the expected number of jobs in queue in steady state is upper-bounded as where the function $g(x,y,z) \triangleq x \frac{1}{2\mu z} + y\left[\frac{R}{\mu z^2} + \frac{3}{2 \mu z} \right]$ and the constant $L_1 = \frac{2}{3}\sqrt{\frac{\pi}{2}}$.
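
The helper quantities named in Theorem 1 can be evaluated directly. The following is a minimal Python sketch, not the authors' code: it implements only the function $g$ and the constant $L_1$ exactly as defined above, plus a check of the theorem's two preconditions. The bound itself is not reproduced here because its expression does not appear in this excerpt. The names `g`, `L1`, and `theorem1_applies` are illustrative, and passing `mu` and `R` as explicit arguments is an assumption made to keep the function self-contained.

```python
import math

# Constant L_1 = (2/3) * sqrt(pi / 2), as stated in Theorem 1.
L1 = (2.0 / 3.0) * math.sqrt(math.pi / 2.0)


def g(x, y, z, mu, R):
    """g(x, y, z) = x / (2 mu z) + y * [R / (mu z^2) + 3 / (2 mu z)].

    mu is the service rate and R = k * rho is the offered load; both are
    passed explicitly here (an assumption for this sketch).
    """
    return x / (2.0 * mu * z) + y * (R / (mu * z ** 2) + 3.0 / (2.0 * mu * z))


def theorem1_applies(k, rho, beta, mu):
    """Check the stated preconditions: R = k * rho >= 100 and beta >= 100 / mu."""
    R = k * rho
    return R >= 100 and beta >= 100.0 / mu
```

For instance, with $\frac{1}{\mu} = 1$ ms, $\beta = 1000$ ms, and $\rho = 0.5$ (the parameters used in the figures), the preconditions hold once $k \geq 200$, since then $R = k\rho \geq 100$.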

Figures (9)

  • Figure 1: Simulation results for the M/M/k/Setup-Deterministic, M/M/k/Setup-Exponential, M/M/k (no setup), with mean service time $\frac{1}{\mu}=1$ ms, mean setup time $\beta = 1000$ ms, and load kept at a constant $\rho = 0.5$. Note the high separation between the Exponential and Deterministic models at large scales.
  • Figure 2: Our theoretical results along with simulation data for the M/M/k/Setup-Deterministic and M/M/k/Setup-Exponential, varying the number of servers $k$ while keeping the mean service time $\frac{1}{\mu} = 1$ ms, the mean setup time $\beta = 1000$ ms, and the load $\rho = 0.5$ fixed. (a) A comparison of our results to the true average waiting time in the M/M/k/Setup-Deterministic. Our results behave like the true average waiting time, while the Exponential model behaves differently. (b) A provisioning example highlighting the differences between the Deterministic and Exponential models. To achieve a target waiting time of $20$ ms, our approximation correctly predicts it will take $k \approx 2000$ servers, while the Exponential model predicts that only $k \approx 50$ servers should suffice. See Figure \ref{fig:approxEval} for a more comprehensive evaluation.
  • Figure 3: An example of M/M/k/Setup-Deterministic with $k=4$. The state pictured has $Z(t)=2$ busy servers, which means there are $2$ jobs in service. There is $Q(t)=1$ job in queue, and thus $N(t)=Z(t)+Q(t)=3$ jobs in system.
  • Figure 4: Simulation results demonstrating the high accuracy of Approximation \ref{apx:only}. Each of these 9 plots shows the average waiting time (in ms) as the load $\rho$ varies from $0$ to $1$, holding fixed the total number of servers $k$, the mean service time $\frac{1}{\mu} = 1$ ms, and the setup time $\beta$. In each row, we hold the number of servers $k$ constant while testing increasing values of the setup time $\beta$. In each column, we hold the setup time $\beta$ constant while increasing the number of servers. We plot three quantities: 1) in black, the simulated average waiting time for the M/M/k/Setup-Deterministic; 2) in purple, the predicted average waiting time of Approximation \ref{apx:only}; and 3) the predicted average waiting time given by the "low R" approximation of \ref{eq:ss_approx}, a variation on the single-server setup result of welchfirstservice. We also include, as a reference, a dotted line illustrating the point at which the offered load $R \triangleq k\rho = 1$. Our approximation works well when the average number of busy servers $R > 1$, and the "low R" approximation works well when $R < 1$.
  • Figure 5: A depiction of our decomposition of a renewal cycle into an accumulating phase and draining phase, described in Section \ref{sub:init}. During the accumulating phase, the departure rate $\mu Z(t) \leq \mu R = k \lambda$, so that the system is transiently unstable and a queue accumulates. During the draining phase, the departure rate $\mu Z(t) > k \lambda$, so that the queue drains.
  • ...and 4 more figures
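
Figure 1 uses the M/M/k without setup as a baseline. That baseline has a classical closed form (the Erlang C formula), which the sketch below evaluates in Python. This is standard queueing theory rather than a result of this paper, and it says nothing about the setup models themselves. The function names are illustrative; the sketch assumes a total arrival rate of $k\rho\mu$, which matches the offered load $R = k\rho$ used above.

```python
def erlang_c(k, rho):
    """Probability that an arriving job must wait in an M/M/k with per-server load rho < 1."""
    a = k * rho  # offered load R = k * rho
    # Erlang B via the standard stable recursion B(n) = a B(n-1) / (n + a B(n-1)).
    b = 1.0
    for n in range(1, k + 1):
        b = a * b / (n + a * b)
    # Convert Erlang B (blocking) to Erlang C (waiting).
    return b / (1.0 - rho * (1.0 - b))


def mmk_mean_wait(k, rho, mu):
    """Mean time in queue for the no-setup M/M/k: C(k, R) / (k * mu * (1 - rho))."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1 for stability")
    return erlang_c(k, rho) / (k * mu * (1.0 - rho))
```

With $\mu = 1$ per ms and $\rho = 0.5$, calling `mmk_mean_wait(k, 0.5, 1.0)` for increasing `k` gives the no-setup waiting time in ms, which shrinks rapidly as $k$ grows.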

Theorems & Definitions (46)

  • Theorem 1: Upper Bound on Average Queue Length
  • Theorem 2: Improved Lower Bound on Average Queue Length
  • Lemma 1: Upper Bound on Integral Over Accumulation Period
  • Lemma 2: Upper Bound on Integral Over Draining Period
  • Lemma 3: Lower Bound on Cycle Length
  • Lemma 4: Intervening Stopping Time Lemma
  • Claim 1: Basic Coupling
  • proof
  • Claim 2: Coupling Integral Bound
  • proof
  • ...and 36 more