Improving upon the generalized c-mu rule: a Whittle approach
Zhouzi Li, Keerthana Gurushankar, Mor Harchol-Balter, Alan Scheller-Wolf
TL;DR
This work revisits the time-varying holding cost (TVHC) scheduling problem, in which a job's holding cost grows with its age, a setting where the classic generalized $c\mu$ rule is not optimal in general. The authors translate the problem into a discounted Restless Multi-Armed Bandit with a finite number of arms and derive a novel Whittle-index policy that accounts for arrivals via a threshold-based guess. They prove indexability and obtain a closed-form Whittle index, $W_i(t)=\mu_i\,\mathbb{E}[c_i(t+X)]$ with $X\sim\mathrm{Exp}(\mu_i-\lambda_i)$, where $t$ is the job's age, $c_i$ is the class-$i$ holding cost function, $\mu_i$ is the service rate, and $\lambda_i$ is the arrival rate. When $\lambda_i=0$ the index reduces to the known static case, consistent with prior Whittle-type results. In simulations, the policy consistently outperforms existing heuristics across diverse cost functions and system loads. The paper also addresses optimality in the diffusion limit and acknowledges that optimality is not guaranteed in general.
Abstract
Scheduling a stream of jobs whose holding cost changes over time is a classic and practical problem. Specifically, each job is associated with a holding cost (penalty), where a job's instantaneous holding cost is some increasing function of its class and current age (the time it has spent in the system since its arrival). The goal is to schedule the jobs to minimize the time-average total holding cost across all jobs. The seminal paper on this problem, by Van Mieghem in 1995, introduced the generalized c-mu rule for scheduling jobs. Since then, this problem has attracted significant interest but remains challenging due to the absence of a finite-dimensional state space formulation. Consequently, subsequent works focus on more tractable versions of this problem. This paper returns to the original problem, deriving a heuristic that empirically improves upon the generalized c-mu rule and all existing heuristics. Our approach is to first translate the holding cost minimization problem to a novel Restless Multi-Armed Bandit (R-MAB) problem with a finite number of arms. Based on our R-MAB, we derive a novel Whittle Index policy, which is both elegant and intuitive.
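To make the closed-form index quoted in the TL;DR concrete, the following is a minimal Python sketch that evaluates $W_i(t)=\mu_i\,\mathbb{E}[c_i(t+X)]$ with $X\sim\mathrm{Exp}(\mu_i-\lambda_i)$ by numerical integration. It is an illustration under stated assumptions, not the authors' implementation: the function name `whittle_index`, the midpoint-rule integration, and the truncation point `tail` are all hypothetical choices, and the sketch assumes $0 \le \lambda_i < \mu_i$ so that the exponential rate is positive.

```python
import math


def whittle_index(cost, mu, lam, t, n_steps=20000, tail=40.0):
    """Approximate W(t) = mu * E[cost(t + X)] with X ~ Exp(mu - lam).

    `cost` is the class's holding cost as a function of age.
    Assumes 0 <= lam < mu (hypothetical guard, not from the paper).
    """
    if not (0 <= lam < mu):
        raise ValueError("need 0 <= lam < mu")
    theta = mu - lam
    # Change of variables u = theta * x, so X = U / theta with U ~ Exp(1):
    #   E[cost(t + X)] = integral_0^inf exp(-u) * cost(t + u / theta) du.
    # The integral is truncated at `tail` and evaluated by the midpoint rule.
    h = tail / n_steps
    total = 0.0
    for k in range(n_steps):
        u = (k + 0.5) * h
        total += math.exp(-u) * cost(t + u / theta)
    return mu * total * h


def quadratic_cost(age):
    # Illustrative increasing holding cost c(a) = a^2.
    return age ** 2


if __name__ == "__main__":
    # For c(a) = a^2 the expectation has the closed form
    #   E[(t + X)^2] = t^2 + 2 t / theta + 2 / theta^2,
    # which gives a check on the numerical value.
    mu, lam, t = 2.0, 1.0, 3.0
    theta = mu - lam
    exact = mu * (t ** 2 + 2 * t / theta + 2 / theta ** 2)
    print(whittle_index(quadratic_cost, mu, lam, t), exact)
```

For the quadratic cost above with $\mu_i=2$, $\lambda_i=1$, and $t=3$, the closed form gives $W_i(3)=2\,(9+6+2)=34$, and the numerical value should agree closely. Setting `lam=0` corresponds to the static case mentioned in the TL;DR, where $X\sim\mathrm{Exp}(\mu_i)$ is just the job's service time.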
