In Serverless, OS Scheduler Choice Costs Money: A Hybrid Scheduling Approach for Cheaper FaaS

Yuxuan Zhao, Weikang Weng, Rob van Nieuwpoort, Alexandru Uta

TL;DR

This article makes a case for rethinking Linux OS-level scheduling for serverless workloads composed of many short-lived processes. It introduces a hybrid two-level scheduling approach that relies on FaaS characteristics.

Abstract

In Function-as-a-Service (FaaS) serverless, large applications are split into short-lived stateless functions. Deploying functions is mutually profitable: users need not be concerned with resource management, while providers can keep their servers at high utilization rates by running thousands of functions concurrently on a single machine. It is exactly this high concurrency that comes at a cost. The standard Linux Completely Fair Scheduler (CFS) switches between tasks frequently, which prolongs execution times. We present evidence that relying on the default Linux CFS scheduler increases the cost of serverless workloads by up to 10x. In this article, we raise awareness and make a case for rethinking OS-level scheduling in Linux for serverless workloads composed of many short-lived processes. To make serverless more affordable, we introduce a hybrid two-level scheduling approach that relies on FaaS characteristics. Short-running functions are executed in FIFO fashion without preemption, while longer-running functions are passed to CFS after a certain time period. We show that tailor-made OS scheduling can significantly reduce user-facing costs without adding any provider-facing overhead.
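
The two-level idea described in the abstract can be illustrated with a small single-core simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `Task`, `simulate_hybrid`, `fifo_limit`, and `rr_quantum` are hypothetical, plain round-robin stands in for CFS, and demoted tasks only run when the FIFO queue is empty. The metric definitions (response time as the wait before the first run, execution time as the span from first run to completion) are also assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    arrival: float            # time the function is invoked
    service: float            # CPU time the function needs in total
    remaining: float = 0.0
    first_start: Optional[float] = None
    finish: Optional[float] = None

    def __post_init__(self) -> None:
        self.remaining = self.service

    @property
    def response_time(self) -> float:
        # Assumed definition: time spent waiting before the first run.
        return self.first_start - self.arrival

    @property
    def execution_time(self) -> float:
        # Assumed definition: wall-clock span from first run to completion.
        return self.finish - self.first_start


def simulate_hybrid(tasks: List[Task], fifo_limit: float, rr_quantum: float) -> List[Task]:
    """Run tasks on one core: FIFO without preemption for up to `fifo_limit`,
    then demote to a round-robin queue (a simple stand-in for CFS)."""
    pending = deque(sorted(tasks, key=lambda t: t.arrival))
    fifo: deque = deque()     # level 1: new arrivals, run without preemption
    fair: deque = deque()     # level 2: tasks that exceeded fifo_limit
    now = 0.0
    done: List[Task] = []

    while pending or fifo or fair:
        # Admit every task that has arrived by the current time.
        while pending and pending[0].arrival <= now:
            fifo.append(pending.popleft())

        if fifo:
            task, budget = fifo.popleft(), fifo_limit
        elif fair:
            task, budget = fair.popleft(), rr_quantum
        else:
            now = pending[0].arrival  # core is idle: jump to the next arrival
            continue

        if task.first_start is None:
            task.first_start = now
        run = min(task.remaining, budget)
        now += run
        task.remaining -= run

        if task.remaining <= 1e-12:
            task.finish = now
            done.append(task)
        else:
            fair.append(task)  # budget used up: hand over to the fair queue

    return done
```

With three tasks arriving at time 0 that need 2, 10, and 1 time units (`fifo_limit=3`, `rr_quantum=1`), the two short tasks complete at times 2 and 6 without ever being preempted, while the long task is demoted after 3 units and completes at time 13. Lowering `fifo_limit` reduces how long short tasks wait behind long ones, at the price of demoting more tasks to the preemptive level.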

Paper Structure

This paper contains 28 sections, 3 equations, 23 figures, 1 table.

Figures (23)

  • Figure 1: Cost for FIFO and CFS OS scheduling policies, calculated using AWS Lambda pricing. The workload uses the first 12,442 functions in the Microsoft Azure trace. Although the FIFO cost is significantly lower, FIFO introduces unacceptably large latencies for functions that simply wait in queues. We explore this trade-off in the remainder of the article.
  • Figure 2: Left: Distribution of average function duration over a two-week period in the Azure dataset. Right: Function arrival pattern on the first day in the Azure dataset; note the burstiness.
  • Figure 3: Metrics from ArpaciDusseau23-Book. Note that as more tasks accumulate in the global queue, the response time of each task tends to increase. Additionally, when a task is preempted, its execution time is extended until the task completes.
  • Figure 4: Metrics comparison between FIFO and CFS. The FIFO policy achieves good execution time but sacrifices latency.
  • Figure 5: Metrics comparison between the FIFO policy and the FIFO policy with 100 ms preemption. Preemption improves response time at the cost of increased execution time.
  • ...and 18 more figures