It Takes Two to Tango: Serverless Workflow Serving via Bilaterally Engaged Resource Adaptation

Jing Wu, Lin Wang, Quanfeng Deng, Chen Yu, Dong Zhang, Bingheng Yan, Fangming Liu

TL;DR

This work addresses resource inefficiency in serverless workflows caused by early-binding function sizing, which forces developers to provision every function for its worst-case runtime. It introduces Janus, a bilaterally engaged, late-binding framework in which developers provide compact hints that guide the provider's runtime resource adaptation through three components: a profiler, a synthesizer, and an adapter. In experiments on two real-world workflows, Janus improves resource efficiency by up to 34.7% over state-of-the-art baselines while maintaining end-to-end SLOs, with manageable online overhead. The findings suggest that per-workflow, runtime-aware adaptation driven by developer-supplied hints can substantially reduce waste in serverless platforms without compromising latency guarantees, offering a practical path to more efficient serverless execution.

Abstract

Serverless platforms typically adopt an early-binding approach for function sizing, requiring developers to specify an immutable size for each function within a workflow beforehand. Accounting for potential runtime variability, developers must size functions for worst-case scenarios to ensure service-level objectives (SLOs), resulting in significant resource inefficiency. To address this issue, we propose Janus, a novel resource adaptation framework for serverless platforms. Janus employs a late-binding approach, allowing function sizes to be dynamically adapted based on runtime conditions. The main challenge lies in the information barrier between the developer and the provider: developers lack access to runtime information, while providers lack domain knowledge about the workflow. To bridge this gap, Janus allows developers to provide hints containing rules and options for resource adaptation. Providers then follow these hints to dynamically adjust resource allocation at runtime based on real-time function execution information, ensuring compliance with SLOs. We implement Janus and conduct extensive experiments with real-world serverless workflows. Our results demonstrate that Janus enhances resource efficiency by up to 34.7% compared to the state-of-the-art.
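
The late-binding idea in the abstract can be illustrated with a minimal Python sketch. It is not the Janus implementation: the `Option` record, the `pick_size` and `run_workflow` helpers, the shape of the hints table, and the sizes and latencies in the usage note are all hypothetical, chosen only to show how a provider might use developer-supplied options together with observed runtimes to pick the smallest function size that still fits the remaining SLO budget.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Option:
    """Hypothetical hint entry: one candidate size for a function."""
    cpu_millicores: int   # assumed resource unit
    latency_ms: float     # assumed profiled worst-case latency at this size


# Hypothetical hint format: function name -> candidate size options.
Hints = Dict[str, List[Option]]


def pick_size(hints: Hints, pending: Sequence[str], budget_ms: float) -> Option:
    """Pick the cheapest option for pending[0] that still leaves enough
    budget for the remaining functions to run at their fastest option."""
    head, tail = pending[0], pending[1:]
    # Time reserved for downstream functions at their fastest size.
    reserve = sum(min(o.latency_ms for o in hints[f]) for f in tail)
    feasible = [o for o in hints[head] if o.latency_ms + reserve <= budget_ms]
    if feasible:
        return min(feasible, key=lambda o: o.cpu_millicores)
    # Nothing fits the budget: fall back to the fastest option.
    return min(hints[head], key=lambda o: o.latency_ms)


def run_workflow(
    hints: Hints,
    order: Sequence[str],
    slo_ms: float,
    observed_latency: Callable[[str, Option], float],
) -> List[Tuple[str, int]]:
    """Size each function only when it is about to run (late binding),
    using the time actually consumed by the functions before it."""
    budget = slo_ms
    chosen = []
    for i, name in enumerate(order):
        opt = pick_size(hints, order[i:], budget)
        chosen.append((name, opt.cpu_millicores))
        budget -= observed_latency(name, opt)  # runtime feedback
    return chosen
```

With two illustrative functions, `OD` and `QA`, each offered at 1000 millicores (400 ms and 300 ms) and 2000 millicores (250 ms and 200 ms) under a 650 ms SLO, the sketch keeps `QA` at the small size when `OD` finishes in 300 ms and raises it to the large size only when `OD` uses its full 400 ms. An early-binding scheme would have to fix the larger size up front to cover the slow case.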

Paper Structure

This paper contains 26 sections, 4 equations, 9 figures, 2 tables, 2 algorithms.

Figures (9)

  • Figure 1: (a) Slacks of function invocations in production traces; (b) function latency variance caused by varying input worksets for the object detection (OD), question answering (QA), and text-to-speech (TS) functions, respectively; (c) performance interference attributed to co-location of homogeneous functions with different dominant resource demands.
  • Figure 2: Performance comparison (left) between early-binding [eurosys19-grandslam] and late-binding (runtime resource adaptation), where the CPU consumption (right) is normalized by the optimum obtained with exhaustive search.
  • Figure 3: An overview of the system architecture of Janus. The proposed runtime resource adaptation framework bilaterally engages the application developer and the serverless platform provider, where the developer is responsible for the offline part while the provider is responsible for the online part.
  • Figure 4: End-to-end latency distribution of IA with a concurrency (i.e., batch size) of one, two, and three, respectively, under different SLOs (red dashed line). The concurrency of VA is limited to one due to its non-batchable functions (i.e., FE and ICO).
  • Figure 5: Resource consumption of (a) IA (left) and VA (right) with a concurrency of one, and of (b) IA with a concurrency of two (left) and three (right).
  • ...and 4 more figures