Melding the Serverless Control Plane with the Conventional Cluster Manager for Speed and Resource Efficiency

Leonid Kondrashov, Lazar Cvetković, Hancheng Wang, Boxi Zhou, Dmitrii Ustiugov

TL;DR

PulseNet tackles the core serverless challenge of achieving low-latency scaling without sacrificing compatibility with conventional cluster managers. It introduces a dual-track control plane: a standard track using a mature cluster manager for sustainable traffic, and an expedited track with node-local agents that rapidly instantiate disposable Emergency Instances for bursts, bypassing the main cluster manager. Evaluation on production-like traces shows PulseNet delivers 1.5–3.5x performance improvements over Kubernetes-compatible systems and up to 70% cost reductions, while maintaining compatibility and reducing memory waste compared to FaaS-specialized designs. The work demonstrates that by exploiting bimodal traffic characteristics and separating fast burst handling from steady-state management, serverless platforms can achieve high performance and efficiency at scale, enabling tighter co-location of FaaS and BaaS components within a single cluster.

Abstract

Serverless platforms face a trade-off: conventional cluster managers like Kubernetes offer compatibility for co-locating Function-as-a-Service (FaaS) and Backend-as-a-Service (BaaS) components of serverless applications, at the cost of high cold-start latency, whereas specialized FaaS-only systems like Dirigent achieve low latency by sacrificing compatibility, preventing integrated management and optimization. Our analysis reveals that FaaS traffic is bimodal: predictable, sustainable traffic consumes >98% of cluster resources, whereas sporadic, excessive bursts stress the control plane's scaling latency, not its throughput. With these insights, we design PulseNet, a serverless architecture that uses a dual-track control plane tailored to both traffic types. PulseNet's standard track manages sustainable traffic with long-lived, full-featured Regular Instances under a conventional cluster manager, preserving compatibility for the majority of the workload. To handle excessive traffic, an expedited track bypasses the slow manager to rapidly create short-lived, disposable Emergency Instances, minimizing cold-start latency and resource waste from idle instances. This hybrid approach achieves 35% better performance than Dirigent, a FaaS-only system, on a production workload at the same cost and outperforms other Kubernetes-compatible systems by 1.5-3.5x, reducing the cost by up to 70%.
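
The dual-track idea described above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions, not PulseNet's implementation: the class `DualTrackScaler`, its methods, and the `sustained_target` parameter are hypothetical names introduced here, and the way the sustainable load is estimated is left out entirely. The sketch assumes that Emergency Instances appear immediately and are discarded as soon as they are no longer needed, while Regular Instances only become available after a slow reconcile step that stands in for the conventional cluster manager.

```python
from dataclasses import dataclass


@dataclass
class DualTrackScaler:
    """Toy model of a dual-track control plane (all names are hypothetical)."""

    regular: int = 0    # long-lived instances owned by the cluster manager
    emergency: int = 0  # disposable instances created by node-local agents
    pending: int = 0    # regular instances requested but not yet created

    def handle_load(self, in_flight: int, sustained_target: int) -> None:
        """React to the current in-flight concurrency.

        `sustained_target` is an assumed, externally supplied estimate of
        the sustainable part of the load.
        """
        # Standard track: ask the (slow) cluster manager for regular
        # instances until the sustainable load is covered.
        self.pending += max(0, sustained_target - self.regular - self.pending)
        # Expedited track: cover whatever the regular instances cannot
        # serve right now, and drop emergency instances that are unneeded.
        self.emergency = max(0, in_flight - self.regular)

    def reconcile(self) -> None:
        """Model the slow track finishing its pending instance creations."""
        self.regular += self.pending
        self.pending = 0


scaler = DualTrackScaler()
scaler.handle_load(in_flight=10, sustained_target=4)  # burst arrives
after_burst = (scaler.regular, scaler.emergency, scaler.pending)
scaler.reconcile()                                    # slow track catches up
scaler.handle_load(in_flight=10, sustained_target=4)
after_reconcile = (scaler.regular, scaler.emergency, scaler.pending)
scaler.handle_load(in_flight=3, sustained_target=4)   # burst is over
after_calm = (scaler.regular, scaler.emergency, scaler.pending)
```

In this toy run the burst is first absorbed entirely by emergency instances, the regular instances take over the sustainable share once the slow track catches up, and no emergency instances remain after the burst ends.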

Paper Structure

This paper contains 33 sections, 13 figures, 1 table.

Figures (13)

  • Figure 1: Instance number scaling over time in response to the changes in the in-flight request concurrency in the state-of-the-art systems and PulseNet. Kn scales too slowly, whereas Kn-Sync incurs high costs by keeping instances idle for prolonged periods. Dirigent's behavior is similar to Kn (not shown).
  • Figure 2: High-level overview of serverless architecture. The cluster manager is on the path only for cold invocations.
  • Figure 3: Cumulative distribution functions (CDFs) for the components of delays occurring in the systems with synchronous (Kn-Sync) and asynchronous (Kn) control planes.
  • Figure 4: Instance creation time breakdown in Knative.
  • Figure 5: Delays in the Knative control plane under various instance-creation rates, measured with a microbenchmark. The red and black lines show the required instance-creation rates at the 50th and 99th percentiles, respectively, when replaying invocations from a sampled production trace in simulated synchronous (Kn-Sync) and asynchronous (Kn) control planes.
  • ...and 8 more figures