Multiclass Queue Scheduling Under Slowdown: An Approximate Dynamic Programming Approach
Jing Dong, Berk Görgülü, Vahid Sarhangian
TL;DR
This work tackles scheduling in multiclass queues with slowdown, where waiting increases future service requirements. It introduces a simulation-based Approximate Dynamic Programming (ADP) framework that learns an index-based policy via classifiers, estimates differences of the relative value functions with a coupling construction, and uses adaptive sampling to efficiently explore the state space. The approach yields near-optimal policies that outperform standard benchmarks across two-class and multiclass scenarios, and it reveals how optimal decisions balance immediate cost reduction against avoiding high-cost equilibria induced by slowdown. A fluid-model analysis exposes metastability and informs initialization through a fluid-based policy iteration, while a case study on rehabilitation admissions demonstrates substantial reductions in waiting times and improved functional outcomes, underscoring the method's practical impact. The combination of coupling, adaptive sampling, and policy-classifier representation offers a scalable, broadly applicable toolkit for complex queue-control problems with wait-dependent service requirements.
Abstract
In many service systems, especially those in healthcare, customer waiting times can result in increased service requirements. Such service slowdowns can significantly impact system performance. Therefore, it is important to properly account for their impact when designing scheduling policies. Scheduling under wait-dependent service times is challenging, especially when multiple customer classes are heterogeneously affected by waiting. In this work, we study scheduling policies in multiclass, multiserver queues with wait-dependent service slowdowns. We propose a simulation-based Approximate Dynamic Programming (ADP) algorithm to find close-to-optimal scheduling policies. The ADP algorithm (i) represents the policy using classifiers based on the index policy structure, (ii) leverages a coupling method to estimate the differences of the relative value functions directly, and (iii) uses adaptive sampling for efficient state-space exploration. Through extensive numerical experiments, we illustrate that the ADP algorithm generates close-to-optimal policies that outperform well-known benchmarks. We also provide insights into the structure of the optimal policy, which reveals an important trade-off between instantaneous cost reduction and preventing the system from reaching high-cost equilibria. Lastly, we conduct a case study on scheduling admissions into rehabilitation care to illustrate the effectiveness of the ADP algorithm in practice.
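To make the coupling idea in item (ii) concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical discrete-time, two-class, single-server toy queue in which slowdown is crudely modeled as a service-completion probability that decreases with queue length, and a simple hypothetical index rule stands in for the learned classifier policy. All parameter values and function names (`service_prob`, `index_policy`, `simulate_cost`, `value_difference_samples`) are illustrative assumptions. The sketch estimates the difference in finite-horizon cost between two starting states, once with shared random numbers (coupled) and once with independent random numbers.

```python
import random
from statistics import mean, pvariance

# All parameters below are hypothetical and chosen only for illustration.
ARRIVAL_P = (0.2, 0.2)     # per-step arrival probability of each class
BASE_MU = (0.7, 0.8)       # service-completion probability with an empty queue
SLOWDOWN = (0.05, 0.15)    # how strongly congestion slows each class (toy proxy)
COST = (1.0, 2.0)          # holding cost per customer per step


def service_prob(cls, queue_len):
    """Toy slowdown: completion probability shrinks as the class queue grows."""
    return BASE_MU[cls] / (1.0 + SLOWDOWN[cls] * queue_len)


def index_policy(state):
    """Serve the nonempty class with the largest cost-times-rate index."""
    candidates = [i for i in range(2) if state[i] > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda i: COST[i] * service_prob(i, state[i]))


def simulate_cost(state, horizon, rng):
    """Total holding cost over `horizon` steps starting from `state`."""
    x = list(state)
    total = 0.0
    for _ in range(horizon):
        total += sum(c * n for c, n in zip(COST, x))
        # Draw the same number of uniforms every step, whatever the state,
        # so that two coupled sample paths stay synchronized.
        u_arrivals = (rng.random(), rng.random())
        u_service = rng.random()
        served = index_policy(x)
        if served is not None and u_service < service_prob(served, x[served]):
            x[served] -= 1
        for i in range(2):
            if u_arrivals[i] < ARRIVAL_P[i]:
                x[i] += 1
    return total


def value_difference_samples(state_a, state_b, horizon, n_reps, coupled):
    """Samples of cost(state_a) - cost(state_b), with or without coupling."""
    diffs = []
    for rep in range(n_reps):
        rng_a = random.Random(rep)
        # Coupled runs reuse the seed; independent runs use a different one.
        rng_b = random.Random(rep if coupled else rep + 10**6)
        diffs.append(
            simulate_cost(state_a, horizon, rng_a)
            - simulate_cost(state_b, horizon, rng_b)
        )
    return diffs


if __name__ == "__main__":
    for coupled in (True, False):
        d = value_difference_samples((3, 2), (2, 2), 50, 200, coupled)
        label = "coupled" if coupled else "independent"
        print(label, "mean:", mean(d), "variance:", pvariance(d))
```

In this toy setting the two coupled sample paths see identical arrivals and service draws, so they tend to merge once they reach a common state, and the difference estimate is typically far less noisy than when the two paths are simulated independently. Estimating differences directly in this way, rather than subtracting two separately estimated values, is the intuition behind using coupling for differences of relative value functions.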
