Multiclass Queue Scheduling Under Slowdown: An Approximate Dynamic Programming Approach

Jing Dong; Berk Görgülü; Vahid Sarhangian

TL;DR

This work tackles scheduling in multiclass queues with slowdown, where waiting increases future service requirements. It introduces a simulation-based Approximate Dynamic Programming (ADP) framework that learns an index-based policy via classifiers, estimates relative-value function differences with a coupling construction, and uses adaptive sampling to efficiently explore the state space. The approach yields near-optimal policies that outperform standard benchmarks across two-class and multi-class scenarios, and it reveals how optimal decisions balance immediate cost reduction against avoiding high-cost equilibria induced by slowdown. A fluid-model analysis exposes meta-stability and informs initialization through a fluid-based policy iteration, while a case study on rehabilitation admissions demonstrates substantial reductions in waiting times and improved functional outcomes, underscoring the method's practical impact. The combination of coupling, adaptive sampling, and policy-classifier representation offers a scalable, broadly applicable toolkit for complex queue-control problems with wait-dependent service requirements.

Abstract

In many service systems, especially those in healthcare, customer waiting times can result in increased service requirements. Such service slowdowns can significantly impact system performance. Therefore, it is important to properly account for their impact when designing scheduling policies. Scheduling under wait-dependent service times is challenging, especially when multiple customer classes are heterogeneously affected by waiting. In this work, we study scheduling policies in multiclass, multiserver queues with wait-dependent service slowdowns. We propose a simulation-based Approximate Dynamic Programming (ADP) algorithm to find close-to-optimal scheduling policies. The ADP algorithm (i) represents the policy using classifiers based on the index policy structure, (ii) leverages a coupling method to estimate the differences of the relative value functions directly, and (iii) uses adaptive sampling for efficient state-space exploration. Through extensive numerical experiments, we illustrate that the ADP algorithm generates close-to-optimal policies that outperform well-known benchmarks. We also provide insights into the structure of the optimal policy, which reveals an important trade-off between instantaneous cost reduction and preventing the system from reaching high-cost equilibria. Lastly, we conduct a case study on scheduling admissions into rehabilitation care to illustrate the effectiveness of the ADP algorithm in practice.
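Components (ii) and (iii) above can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes a hypothetical discrete-time, two-class, single-server toy queue in which the service-completion probability of class $i$ decreases linearly in its own queue length, a placeholder follow-on policy, and a simple t-statistic stopping rule. All parameter values and the names `simulate_cost`, `follow_on_action`, and `choose_action` are made up for illustration. Coupling is realized here by reusing one random seed for both candidate first actions, so that the two sample paths share arrivals and service draws and their cost difference is estimated directly.

```python
import math
import random
import statistics

# Hypothetical toy parameters, chosen only for illustration.
LAM = (0.3, 0.3)    # per-step arrival probabilities
MU = (0.9, 0.9)     # base service-completion probabilities
A = (0.02, 0.02)    # slowdown per customer of the same class in queue
H = (3.0, 1.0)      # holding costs
KAPPA = (40, 40)    # blocking thresholds


def completion_prob(i, x):
    """Completion probability of class i, decreasing in its queue length (assumed form)."""
    return max(MU[i] - A[i] * x[i], 0.05)


def follow_on_action(x):
    """Placeholder base policy: serve the nonempty class with the largest h_i * rate."""
    candidates = [i for i in (0, 1) if x[i] > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda i: H[i] * completion_prob(i, x))


def simulate_cost(state, first_action, seed, horizon=200):
    """Total holding cost over `horizon` steps with `first_action` forced at step 0."""
    rng = random.Random(seed)
    x = list(state)
    cost = 0.0
    for t in range(horizon):
        # Always draw three uniforms per step so that coupled runs stay aligned.
        u_arr = (rng.random(), rng.random())
        u_srv = rng.random()
        action = first_action if t == 0 else follow_on_action(x)
        if action is not None and x[action] > 0 and u_srv < completion_prob(action, x):
            x[action] -= 1
        for i in (0, 1):
            if u_arr[i] < LAM[i] and x[i] < KAPPA[i]:
                x[i] += 1
        cost += H[0] * x[0] + H[1] * x[1]
    return cost


def choose_action(state, t_threshold=2.0, min_reps=10, max_reps=500):
    """Sample coupled cost differences until the t-statistic is decisive or the budget ends."""
    diffs = []
    for seed in range(max_reps):
        # Same seed for both first actions: the two paths share all random draws.
        diffs.append(simulate_cost(state, 0, seed) - simulate_cost(state, 1, seed))
        if len(diffs) >= min_reps:
            sd = statistics.stdev(diffs)
            if sd == 0.0:
                break
            t_stat = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
            if abs(t_stat) > t_threshold:
                break
    best = 0 if statistics.mean(diffs) <= 0 else 1
    return best, len(diffs)
```

Calling `choose_action((10, 10))` returns the index of the class to serve first together with the number of coupled replications used. States in which the two actions are nearly equivalent tend to consume more of the replication budget, which is the intuition behind allocating simulation effort adaptively across states.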

Paper Structure (32 sections, 3 theorems, 42 equations, 13 figures, 10 tables, 6 algorithms)

Introduction
Literature Review
Model Description
Connection to Wait-Dependent Service Times
Fluid Approximation and Equilibrium Analysis
Approximate Dynamic Programming
Sampling-Based ADP: Approximate Policy Iteration
Estimation of Value Function Differences via Coupling
Adaptive Sampling Approach
Extension to the Non-preemptive Setting
ADP Performance and Structure of the Optimal Policy
The performance of the ADP Algorithm
Worst-case system load.
Service rate heterogeneity.
Arrival rate heterogeneity.
...and 17 more sections

Key Result

Proposition 1

For any initial condition $\bar{X}(0)$, there exists a solution to the problem defined by the fluid dynamics equations (eq:fluid_dynamics).

Figures (13)

Figure 1: Vector field of the fluid model and an example of sample paths for the stochastic system under meta-stability with parameters $\lambda_1 = \lambda_2=1.5$, $\mu_1=\mu_2=1$, $a_1=0.03$, $a_2=0.02$, $\kappa_1=\kappa_2= 30$, $C=4$.
Figure 2: A visual illustration of optimal policy and relative value function under the parameters $\lambda = (0.3,0.3), \mu = (0.9,0.9), a = (0.02,0.02), \kappa = (40,40), h = (1,1), b = (0,0), \bar{f}_i(x)= \mu_i - a_i x_i$.
Figure 3: Optimal policy, ADP policy generated when logistic regression is used as the generalization method, and ADP policy obtained directly for all states, with $\lambda = (1.5,1.5)$, $\mu = (1,1)$, $a = (0.01,0.02)$, $\kappa = (30,30)$, $h = (3.5,1)$, $b = (0,0)$, $C=4$, $\bar{f}_i(x)= \mu_i - a_i x_i$.
Figure 4: Distribution of the simulation times using the coupling method versus the regeneration method. (Arrival rates: $\lambda = (1.5,1.5)$; base service rates: $\mu = (1,1)$; slowdown rates: $a = (0.0103, 0.0203)$; blocking thresholds: $\kappa = (30,30)$; blocking costs: $b=(0,0)$; holding costs: $h = (3,1)$. Initial states: $(10,10), (15,15), (20,20)$.)
Figure 5: t-statistics on $R_1-R_2$ calculated over 100 replications and the number of replications taken in different states of the sample space for identifying the correct action with a high probability guarantee ($\alpha = 0.95$).
...and 8 more figures

Theorems & Definitions (7)

Proposition 1
Definition 1
Definition 2
Definition 3
Proposition 2
Remark 1
Lemma EC.1
