SkipPredict: When to Invest in Predictions for Scheduling
Rana Shahout, Michael Mitzenmacher
TL;DR
This work studies how to incorporate prediction costs into queueing-based scheduling. It introduces SkipPredict, a two-stage policy for an $M/G/1$ system: cheap one-bit predictions separate short jobs from long jobs, and expensive predictions estimate remaining processing times for the long group only. The paper analyzes two cost models, external (predictions incur a cost but do not affect job service times) and server-time (predictions consume processing time on the same server as the jobs), and derives exact mean response-time expressions via the SOAP framework, comparing SkipPredict to FCFS, 1bit, SPRPT, and a DelayPredict baseline. The results show that SkipPredict can outperform the competing policies, especially when there is a sizable gap between cheap and expensive prediction costs and when, under high load, predictions for long jobs yield substantial gains. The paper also explores generalizations to non-fixed costs and offers several SkipPredict variants and practical considerations for deployment, enabling cost-aware use of predictions in scheduling systems.
Abstract
In light of recent work on scheduling with predicted job sizes, we consider the effect of the cost of predictions in queueing systems, removing the assumption in prior research that predictions are external to the system's resources and/or cost-free. In particular, we introduce a novel approach to utilizing predictions, SkipPredict, designed to address their inherent cost. Rather than uniformly applying predictions to all jobs, we propose a tailored approach that categorizes jobs based on their prediction requirements. To achieve this, we employ one-bit "cheap predictions" to classify jobs as either short or long. SkipPredict prioritizes predicted short jobs over long jobs, and for the latter, SkipPredict applies a second round of more detailed "expensive predictions" to approximate Shortest Remaining Processing Time for these jobs. Our analysis takes into account the cost of prediction. We examine the effect of this cost for two distinct models. In the external cost model, predictions are generated by some external method without impacting job service times but incur a cost. In the server time cost model, predictions themselves require server processing time, and are scheduled on the same server as the jobs.
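The two-stage priority rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Job`, `rank`, and `pick_next` are hypothetical, the first-come-first-served order among predicted short jobs is an assumption the abstract does not state, and prediction costs are not modeled at all.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    arrival_time: float
    predicted_short: bool                   # cheap one-bit prediction
    predicted_size: Optional[float] = None  # expensive prediction (long jobs only)
    served: float = 0.0                     # service received so far


def rank(job: Job) -> Tuple[int, float]:
    """Return a sort key; the job with the smallest key is served first."""
    if job.predicted_short:
        # Assumption: predicted short jobs are ordered by arrival time.
        return (0, job.arrival_time)
    if job.predicted_size is None:
        raise ValueError("a predicted long job needs an expensive prediction")
    # Predicted long jobs: smallest predicted remaining time first.
    return (1, job.predicted_size - job.served)


def pick_next(queue: List[Job]) -> Job:
    """Pick the job the server should work on next."""
    return min(queue, key=rank)
```

Because the key is a tuple whose first component is the class (0 for predicted short, 1 for predicted long), any predicted short job outranks every predicted long job, and the expensive prediction is only consulted within the long class.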
