Litmus: Fair Pricing for Serverless Computing

Qi Pei, Yipeng Wang, Seunghee Shin

TL;DR

The paper tackles unfair pricing in multi-tenant serverless platforms caused by congestion, proposing Litmus pricing, a lightweight, congestion-aware discount mechanism. It uses a Litmus test during function startup to quantify system congestion and decomposes cost into $P = P_{private} + P_{shared}$ with separate rates for private and shared resources, derived from $R = R_{base} \cdot \frac{T_{solo}}{T_{congestion}}$ and measured via performance counters. By leveraging reference workloads and regression (with L3 misses as a supplemental signal), Litmus produces prices that closely track ideal discounts, achieving an average deviation of about $0.2\%$ in heavily congested environments. The approach maintains high resource utilization while offering tenants fair compensation for slowdown, and it demonstrates robustness across temporal sharing, CPU frequency variations, and different CPU architectures.
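
To make the two formulas concrete, here is a minimal Python sketch of how a congestion-discounted price could be computed. It is an illustration under stated assumptions, not the paper's implementation: it assumes each component is billed as rate times measured time, and it takes the solo and congested times as given inputs, whereas the paper estimates the slowdown from Litmus tests and regression. The function and parameter names are hypothetical.

```python
def discounted_rate(base_rate: float, t_solo: float, t_congestion: float) -> float:
    """R = R_base * T_solo / T_congestion (formula from the summary above)."""
    if t_solo <= 0 or t_congestion <= 0:
        raise ValueError("execution times must be positive")
    return base_rate * t_solo / t_congestion


def litmus_price(
    base_rate_private: float,
    base_rate_shared: float,
    t_private_solo: float,
    t_private_congestion: float,
    t_shared_solo: float,
    t_shared_congestion: float,
) -> float:
    """P = P_private + P_shared, each part billed at its own discounted rate.

    Assumption (not stated in the summary): each component's price is
    its discounted rate multiplied by the time measured under congestion.
    """
    r_private = discounted_rate(base_rate_private, t_private_solo, t_private_congestion)
    r_shared = discounted_rate(base_rate_shared, t_shared_solo, t_shared_congestion)
    p_private = r_private * t_private_congestion
    p_shared = r_shared * t_shared_congestion
    return p_private + p_shared


# Hypothetical numbers: the shared-resource part slows down 2x,
# the private part 1.25x, and both base rates are 1.0 per second.
price = litmus_price(1.0, 1.0, 2.0, 2.5, 1.0, 2.0)
undiscounted = 1.0 * 2.5 + 1.0 * 2.0
print(f"discounted: {price:.2f}, undiscounted: {undiscounted:.2f}")
```

Under these assumptions the discount cancels the slowdown exactly, so the tenant pays what the solo run would have cost; any deviation in practice would come from how accurately the slowdown is estimated.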

Abstract

Serverless computing has emerged as a market-dominant paradigm in modern cloud computing, benefiting both cloud providers and tenants. While service providers can optimize their machine utilization, tenants only need to pay for the resources they use. To maximize resource utilization, these serverless systems co-run numerous short-lived functions, bearing frequent system condition shifts. When the system gets overcrowded, a tenant's function may suffer from disturbing slowdowns. Ironically, tenants also incur higher costs during these slowdowns, as commercial serverless platforms determine costs proportional to their execution times. This paper argues that cloud providers should compensate tenants for losses incurred when the server is over-provisioned. However, estimating tenants' losses is challenging without pre-profiled information about their functions. Prior studies have indicated that assessing tenant losses leads to heavy overheads. As a solution, this paper introduces a new pricing model that offers discounts based on the machine's state while presuming the tenant's loss under that state. To monitor the machine state accurately, Litmus pricing frequently conducts Litmus tests, an effective and lightweight solution for measuring system congestion. Our experiments show that Litmus pricing can accurately gauge the impact of system congestion and offer nearly ideal prices, with only a 0.2% price difference on average, in a heavily congested system.

Paper Structure (15 sections, 3 equations, 21 figures, 1 table)

Figures (21)

  • Figure 1: (a) L2 misses and (b) L3 misses of traffic generators, both normalized to the average L2 and L3 misses of the serverless applications listed in Table 1
  • Figure 2: Execution time of applications that run with 26 others, normalized to the execution time when running alone
  • Figure 3: $T_{private}$ and $T_{shared}$ of applications that run with 26 others, normalized to those when running alone
  • Figure 4: Execution time distribution of $T_{private}$ and $T_{shared}$
  • Figure 5: Congestion and performance tables: numbers in both tables indicate the slowdowns of startup codes and reference functions, collected with CT-Gen and MB-Gen
  • ...and 16 more figures