Raptor: Distributed Scheduling for Serverless Functions
Kevin Exton, Maria Read
TL;DR
Raptor is a distributed speculative-execution scheduler for serverless workloads. It integrates with OpenWhisk 2.0 to form flights of parallel function executions that share state over SCTP and use POSIX preemption. The approach targets straggler mitigation in warehouse-scale deployments by leveraging OS-level job controls and a forking execution model, and it achieves substantial end-to-end latency reductions and improved reliability for parallelizable workflows, particularly as horizontal scale and randomness increase. Empirical evaluation on HA OpenWhisk across three availability zones shows that Raptor closely matches theoretical gains when function execution times are mutually independent and exponentially distributed, with notable reductions for RSA key generation, word count, and image thumbnail workflows. The work highlights practical considerations for integrating distributed function executors into existing serverless platforms, and it points to directions for broader portability, security, and protocol extensibility that would bring Raptor’s benefits to diverse cloud environments.
Abstract
To support parallelizable serverless workflows in applications such as media processing, we have prototyped a distributed scheduler called Raptor that reduces both the end-to-end delay and the failure rate of these workflows. Because major cloud providers typically deploy modern serverless frameworks to extremely large-scale distributed computing environments, Raptor is specifically designed to exploit the statistical independence of function executions that tends to emerge at very large scales. Our evaluation shows the effect of horizontal scale on function execution: the mean delay improvements that Raptor provides for RSA public-private key pair generation can be accurately predicted by modeling execution times as mutually independent exponential random variables, but only once the serverless framework is deployed in a highly available configuration and horizontally scaled across three availability zones.
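The independent-exponential model mentioned above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Raptor's implementation: it assumes a flight completes as soon as the first of n redundant executions finishes, and that each execution time is drawn independently from the same exponential distribution. Under those assumptions the minimum of n draws is itself exponential with n times the rate, so the expected delay falls from the single-execution mean to that mean divided by n. The function names and parameters (`simulate_mean_delay`, `predicted_mean_delay`, `n_copies`, `mean_single`) are hypothetical and chosen only for illustration.

```python
import random
import statistics


def predicted_mean_delay(n_copies: int, mean_single: float) -> float:
    """Mean of the minimum of n i.i.d. exponential variables: mean_single / n."""
    return mean_single / n_copies


def simulate_mean_delay(n_copies: int, mean_single: float,
                        trials: int = 20000, seed: int = 0) -> float:
    """Estimate the mean delay when the fastest of n_copies executions wins.

    Assumption (not taken from the paper): every copy's execution time is an
    independent exponential draw with the same mean, and the flight finishes
    when the first copy finishes.
    """
    rng = random.Random(seed)       # fixed seed so runs are repeatable
    rate = 1.0 / mean_single        # expovariate takes the rate (1 / mean)
    delays = [
        min(rng.expovariate(rate) for _ in range(n_copies))
        for _ in range(trials)
    ]
    return statistics.fmean(delays)


if __name__ == "__main__":
    mean_single = 2.0  # hypothetical single-execution mean, arbitrary units
    for n in (1, 2, 4, 8):
        simulated = simulate_mean_delay(n, mean_single)
        predicted = predicted_mean_delay(n, mean_single)
        print(f"n={n}: simulated={simulated:.3f} predicted={predicted:.3f}")
```

If execution times are correlated, for example because all copies contend for the same host or the same availability zone, the minimum no longer behaves this way, which is consistent with the abstract's observation that the prediction holds only once the deployment is horizontally scaled.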
