
An infinite server system with packing constraints and ranked servers

Alexander Stolyar

TL;DR

The paper studies real-time placement of multiple customer types, arriving as Poisson processes, into an infinite system of ranked servers with general packing constraints. The objective is to minimize the location $U$ of the rightmost occupied server, which requires both keeping the total number $Q$ of occupied servers small and keeping $U$ close to $Q$. It introduces two algorithms, GRAND($aZ$)-FF and GRAND($Z^p$)-FF, which combine rank-oblivious GRAND variants with the First-Fit rule for taking empty servers, and proves that $U^r(\infty)/r$ converges to $q^{*,a}$ (the limit of the scaled number of occupied servers under GRAND($aZ$)) and to the fluid-optimal value $q^*$, respectively, as the scaling parameter $r$ grows. The key tool is a generic reduction theorem (Theorem \ref{th-reduction-uni}): if a rank-oblivious algorithm keeps the scaled number of occupied servers near a fixed level and empties servers at a sufficient rate, then pairing it with First-Fit keeps $U^r(\infty)/r$ close to that level. This theorem is then applied to the GRAND variants to obtain the main results, e.g., $U^r(\infty)/r \Rightarrow q^{*,a}$ as $r \to \infty$. The analysis uses local fluid limits, strong approximations of Poisson processes, and Lyapunov-based arguments to connect the micro-dynamics of server configurations to macroscopic occupancy targets. The motivating application is VM placement in cloud data centers: the results indicate that placement rules using very little state information can achieve near-optimal utilization while keeping the active servers compactly placed in large-scale systems.

Abstract

A service system with multiple types of customers, arriving as Poisson processes, is considered. The system has an infinite number of servers, ranked by $1,2,3, \ldots$; a server rank is its ``location.'' Each customer has an independent exponentially distributed service time, with the mean determined by its type. Multiple customers (possibly of different types) can be placed for service into one server, subject to ``packing'' constraints. Service times of different customers are independent, even if served simultaneously by the same server. The large-scale asymptotic regime is considered, such that the mean number of customers $r$ goes to infinity. We seek algorithms with the underlying objective of minimizing the location (rank) $U$ of the right-most (highest ranked) occupied (non-empty) server. Therefore, this objective seeks to minimize the total number $Q$ of occupied servers {\em and} keep the set of occupied servers as far to the ``left'' as possible, i.e., keep $U$ close to $Q$. In previous work, versions of the {\em Greedy Random} (GRAND) algorithm have been shown to asymptotically minimize $Q/r$ as $r\to\infty$. In this paper we show that when these algorithms are combined with the First-Fit rule for ``taking'' empty servers, they asymptotically minimize $U/r$ as well.
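To make the setup concrete, the following is a minimal simulation sketch of a GRAND($aZ$)-style rule combined with First-Fit, not the paper's implementation. Several things here are assumptions that this excerpt does not state: the packing constraint is modeled as a single capacity with per-type sizes (the paper allows general constraints, and each size is assumed to fit in an empty server), the number of empty servers offered to an arrival is taken to be $\lceil aZ \rceil$ with $Z$ the current number of customers, and all function and parameter names (`first_fit_empty`, `fits`, `place`, `simulate`) are hypothetical. The sketch simulates only the embedded jump chain, so event times are not tracked.

```python
import math
import random


def first_fit_empty(servers):
    """First-Fit: return the lowest rank (1, 2, 3, ...) that is not occupied."""
    rank = 1
    while rank in servers:
        rank += 1
    return rank


def fits(config, i, sizes, capacity):
    """Assumed packing constraint: total size in a server must not exceed capacity."""
    load = sum(k * s for k, s in zip(config, sizes))
    return load + sizes[i] <= capacity


def place(servers, i, a, sizes, capacity, rng):
    """Place a type-i arrival; returns the rank of the server that receives it."""
    z = sum(sum(c) for c in servers.values())  # customers currently in the system
    zero_servers = math.ceil(a * z)            # assumed: ceil(aZ) empty servers offered
    open_ranks = [rk for rk, c in servers.items() if fits(c, i, sizes, capacity)]
    n = len(open_ranks) + zero_servers
    pick = rng.randrange(n) if n > 0 else 0    # uniform over all offered servers
    if pick < len(open_ranks):
        rank = open_ranks[pick]                # rank-oblivious choice among occupied servers
    else:
        rank = first_fit_empty(servers)        # an empty server is taken by First-Fit
    servers.setdefault(rank, [0] * len(sizes))[i] += 1
    return rank


def simulate(lam, mu, sizes, capacity, a, n_events, seed=0):
    """Run n_events arrivals/departures; return (servers, Q, U)."""
    rng = random.Random(seed)
    servers = {}  # rank -> list of per-type customer counts (occupied servers only)
    arr_total = sum(lam)
    for _ in range(n_events):
        slots = [(rk, i) for rk, c in servers.items() for i, k in enumerate(c) if k > 0]
        dep_w = [servers[rk][i] * mu[i] for rk, i in slots]
        if rng.random() * (arr_total + sum(dep_w)) < arr_total:
            i = rng.choices(range(len(lam)), weights=lam)[0]
            place(servers, i, a, sizes, capacity, rng)
        else:
            rk, i = rng.choices(slots, weights=dep_w)[0]
            servers[rk][i] -= 1
            if not any(servers[rk]):
                del servers[rk]  # server becomes empty and can be taken again
    q = len(servers)             # Q: number of occupied servers
    u = max(servers, default=0)  # U: rank of the right-most occupied server
    return servers, q, u
```

Calling, for example, `simulate([20.0, 10.0], [1.0, 1.0], [1, 2], 4, 0.1, 2000)` returns the final configuration together with $Q$ and $U$. Multiplying the arrival rates by a larger factor mimics growing $r$, and the ratio $U/Q$ can then be compared across runs. Since occupied servers have distinct ranks, $U \ge Q$ holds in every state.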

Paper Structure (16 sections, 8 theorems, 63 equations)


Key Result

Proposition 3

(i) For a fixed $a>0$, consider a sequence of systems under the GRAND($a Z$) algorithm, indexed by $r\to\infty$. Then, $\boldsymbol{x}^r(\infty) \Rightarrow \boldsymbol{x}^{*,a}$; in particular, $q^r(\infty) \Rightarrow q^{*,a}$. (ii) As $a\downarrow 0$, $\boldsymbol{x}^{*,a} \to {\cal X}^*$; in par

Theorems & Definitions (14)

  • Remark 1
  • Remark 2
  • Definition 1: GRAND($a Z$)-FF algorithm
  • Definition 2: GRAND($Z^p$)-FF algorithm
  • Proposition 3: From theorems 3 and 4 in StZh2013
  • Theorem 4
  • Proposition 5: From theorem 1 in StZh2015
  • Theorem 6
  • Theorem 7
  • Proof of Theorem \ref{th-reduction-uni}
  • ...and 4 more