Improving Upon the generalized c-mu rule: a Whittle approach

Zhouzi Li, Keerthana Gurushankar, Mor Harchol-Balter, Alan Scheller-Wolf

TL;DR

This work revisits the time-varying holding cost (TVHC) scheduling problem, in which a job's holding cost grows with its age and the classic generalized $c\mu$ rule is not always optimal. The authors translate the problem into a discounted Restless Multi-Armed Bandit with a finite number of arms and derive a novel Whittle Index policy that accounts for arrivals via a threshold-based guess. They prove indexability and obtain a closed-form Whittle index, $W_i(t)=\mu_i\,\mathbb{E}[c_i(t+X)]$ with $X\sim\mathrm{Exp}(\mu_i-\lambda_i)$. When $\lambda_i=0$, the index reduces to the known static-case index, consistent with prior Whittle-type results. In simulations, the policy consistently outperforms existing heuristics across diverse cost functions and system loads. The work also discusses optimality in the diffusion limit and the limits of the policy's optimality in general.
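As a rough illustration of the closed-form index $W_i(t)=\mu_i\,\mathbb{E}[c_i(t+X)]$, $X\sim\mathrm{Exp}(\mu_i-\lambda_i)$, the sketch below evaluates the expectation numerically for an arbitrary cost function. It assumes that $\mu_i$ is the class's service rate, $\lambda_i$ its arrival rate, and $t$ the job's age; the function name `whittle_index` and the midpoint-rule quadrature are illustrative choices, not taken from the paper.

```python
import math
from typing import Callable


def whittle_index(cost: Callable[[float], float], age: float,
                  mu: float, lam: float, n: int = 20_000) -> float:
    """Approximate mu * E[cost(age + X)] with X ~ Exp(mu - lam).

    Hypothetical helper: the expectation is computed with a midpoint rule
    after the substitution u = 1 - exp(-rate * x), which maps [0, inf)
    onto [0, 1) and turns the exponential density into a uniform weight.
    """
    if not 0 <= lam < mu:
        raise ValueError("need 0 <= lam < mu")
    rate = mu - lam
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n                  # midpoint of the k-th cell in (0, 1)
        x = -math.log(1.0 - u) / rate      # inverse CDF of Exp(rate)
        total += cost(age + x)
    return mu * total / n


# Linear cost c(t) = 2t: the expectation is 2 * (t + 1/(mu - lam)) exactly,
# so the index should be close to mu * 2 * (t + 1/(mu - lam)).
w = whittle_index(lambda t: 2.0 * t, age=1.0, mu=3.0, lam=1.0)
exact = 3.0 * 2.0 * (1.0 + 1.0 / (3.0 - 1.0))
print(w, exact)
```

Setting `lam=0` makes $X\sim\mathrm{Exp}(\mu_i)$, which corresponds to the static case mentioned above. For linear costs the expectation has the closed form shown in the comment, so the quadrature is needed only for more general cost functions.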

Abstract

Scheduling a stream of jobs whose holding cost changes over time is a classic and practical problem. Specifically, each job is associated with a holding cost (penalty), where a job's instantaneous holding cost is some increasing function of its class and current age (the time it has spent in the system since its arrival). The goal is to schedule the jobs to minimize the time-average total holding cost across all jobs. The seminal paper on this problem, by Van Mieghem in 1995, introduced the generalized c-mu rule for scheduling jobs. Since then, this problem has attracted significant interest but remains challenging due to the absence of a finite-dimensional state space formulation. Consequently, subsequent works focus on more tractable versions of this problem. This paper returns to the original problem, deriving a heuristic that empirically improves upon the generalized c-mu rule and all existing heuristics. Our approach is to first translate the holding cost minimization problem to a novel Restless Multi-Armed Bandit (R-MAB) problem with a finite number of arms. Based on our R-MAB, we derive a novel Whittle Index policy, which is both elegant and intuitive.

Paper Structure

This paper contains 30 sections, 23 theorems, 110 equations, 10 figures, 2 tables.

Key Result

Lemma 3.1

The optimal policy must serve jobs within each type in FCFS order.
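One practical consequence of this lemma is that a scheduler only needs to compare the oldest job of each type. The sketch below illustrates that idea under stated assumptions: each type has a queue of arrival times in arrival order, and `index` is any priority function of type and age (for example a Whittle-style index). The names `pick_type`, `queues`, and `index` are hypothetical and do not come from the paper.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional


def pick_type(queues: Dict[str, Deque[float]], now: float,
              index: Callable[[str, float], float]) -> Optional[str]:
    """Return the type whose oldest job has the highest index, or None.

    Only the head of each queue is inspected, because jobs within a type
    are served first-come-first-served.
    """
    best, best_val = None, float("-inf")
    for job_type, arrivals in queues.items():
        if not arrivals:
            continue                      # no waiting job of this type
        age = now - arrivals[0]           # head of deque = earliest arrival
        val = index(job_type, age)
        if val > best_val:
            best, best_val = job_type, val
    return best


# Illustrative priority: a per-type weight times the job's age.
weights = {"A": 1.0, "B": 3.0}
queues = {"A": deque([0.0, 2.0]), "B": deque([1.0])}
chosen = pick_type(queues, now=4.0, index=lambda ty, age: weights[ty] * age)
print(chosen)
```

In this example the oldest type-A job has age 4 and index 4, while the oldest type-B job has age 3 and index 9, so type B is selected.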

Figures (10)

  • Figure 1: Classes with different holding costs.
  • Figure 2: Example: generalized $c\mu$ rule is suboptimal.
  • Figure 3: A road-map for deriving the Whittle Index policy for the static version of our problem: the static version is first translated to an R-MAB problem, and the Whittle Index is then derived from that R-MAB problem.
  • Figure 4: A road-map used to derive the Whittle Index policy for the queue-length holding cost setting.
  • Figure 5: Using the bandit-translation theorem (`thm:bandit`) and the index theorem (`thm:index`), we follow this road-map to derive the Whittle Index policy.
  • ...and 5 more figures

Theorems & Definitions (52)

  • Lemma 3.1: FCFS within each type is optimal
  • proof
  • Theorem 1: Translation to an R-MAB
  • proof: Proof sketch
  • Lemma 3.2
  • Definition 4.1: $\Pi$
  • Definition 4.2: Indexability
  • Definition 4.3: Whittle's Index
  • Definition 4.4: $\mathrm{Threshold}(x)$
  • Definition 4.5: Cost of $\mathrm{Threshold}(x)$
  • ...and 42 more