FaaS Is Not Enough: Serverless Handling of Burst-Parallel Jobs

Daniel Barcelona-Pons, Aitor Arjona, Pedro García-López, Enrique Molina-Giménez, Stepan Klymonchuk

TL;DR

This work identifies fundamental limitations of Function-as-a-Service for burst-parallel workloads, chiefly the lack of group awareness and locality. It introduces burst computing, which adds a group invocation primitive (flare) and worker packing to launch and coordinate large worker groups in shared environments, significantly reducing startup latency and enabling synchronous, locality-driven communication. A complete prototype built on OpenWhisk includes a Rust runtime and a Burst Communication Middleware (BCM) that supports zero-copy in-pack messaging and scalable inter-pack communication via multiple backends, including Redis and DragonflyDB. Evaluation on real workloads such as PageRank, TeraSort, and hyperparameter tuning demonstrates substantial improvements in invocation latency, worker simultaneity, and data movement, yielding up to thirteenfold speed-ups and up to 98% reductions in remote traffic. Overall, burst computing demonstrates a practical path toward a job-centric, locality-aware serverless paradigm capable of handling burst-parallel applications previously deemed infeasible on FaaS.

Abstract

Function-as-a-Service (FaaS) struggles with burst-parallel jobs because starting a job requires multiple independent invocations. The lack of a group invocation primitive complicates application development and overlooks crucial aspects such as locality and worker communication. We introduce a new serverless solution designed specifically for burst-parallel jobs. Unlike FaaS, our solution ensures job-level isolation using a group invocation primitive, allowing large groups of workers to be launched simultaneously. This approach optimizes resource allocation by consolidating workers into fewer containers, which speeds up their initialization and enhances locality. Enhanced locality drastically reduces remote communication compared to FaaS and, combined with simultaneity, enables workers to communicate synchronously via message passing and group collectives. This makes feasible applications that are impractical with FaaS. We implemented our solution on OpenWhisk, providing a communication middleware that exploits locality with zero-copy messaging. Evaluations show that it reduces job invocation and communication latency, resulting in a 2$\times$ speed-up for TeraSort and a 98.5% reduction in remote communication for PageRank (13$\times$ speed-up) compared to traditional FaaS.
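
To make the effect of consolidating workers into fewer containers concrete, the following is a minimal sketch, not the paper's implementation. It assumes workers are assigned to packs in contiguous ID order and that every worker sends one message to every other worker (an all-to-all exchange). The names `pack_workers` and `remote_fraction` are hypothetical and exist only for this illustration. The resulting fractions are a toy model and should not be compared with the measured results reported above.

```python
def pack_workers(burst_size: int, granularity: int) -> list[list[int]]:
    """Group worker IDs 0..burst_size-1 into packs of at most `granularity` workers.

    Assumption: contiguous assignment in ID order; a real scheduler may differ.
    """
    if burst_size < 0 or granularity < 1:
        raise ValueError("burst_size must be >= 0 and granularity >= 1")
    return [
        list(range(start, min(start + granularity, burst_size)))
        for start in range(0, burst_size, granularity)
    ]


def remote_fraction(burst_size: int, granularity: int) -> float:
    """Fraction of ordered worker pairs whose messages cross pack boundaries.

    Assumption: an all-to-all exchange with one message per ordered pair.
    """
    total_pairs = burst_size * (burst_size - 1)
    if total_pairs == 0:
        return 0.0
    packs = pack_workers(burst_size, granularity)
    # Pairs inside the same pack can communicate locally.
    local_pairs = sum(len(pack) * (len(pack) - 1) for pack in packs)
    return (total_pairs - local_pairs) / total_pairs


if __name__ == "__main__":
    # Granularity 1 mimics FaaS: every worker is isolated in its own container.
    for granularity in (1, 3, 6):
        packs = pack_workers(6, granularity)
        print(granularity, packs, remote_fraction(6, granularity))
```

With granularity 1 every message is remote, which mirrors the FaaS case of one worker per container. Larger packs turn a growing share of pairs into in-pack pairs, which is where locality-aware, zero-copy messaging can apply.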

Paper Structure (44 sections, 11 figures, 4 tables, 1 algorithm)


Figures (11)

  • Figure 1: CDF of FaaS function start-up time (cold start) in AWS Lambda for 100 and 1000 function invocations on two memory configurations.
  • Figure 2: Running a data processing job of 6 workers in FaaS and burst computing with granularity 3.
  • Figure 3: Timeline of a parallel job in FaaS and burst computing approaches.
  • Figure 4: Burst computing platform overview.
  • Figure 5: Burst start-up time for different packing granularity (worker latency distribution). Left and right show, respectively, burst sizes of 48 and 960.
  • ...and 6 more figures