Litmus: Fair Pricing for Serverless Computing
Qi Pei, Yipeng Wang, Seunghee Shin
TL;DR
The paper tackles unfair pricing in multi-tenant serverless platforms caused by congestion, proposing Litmus pricing, a lightweight, congestion-aware discount mechanism. It uses a Litmus test during function startup to quantify system congestion and decomposes cost into $P = P_{private} + P_{shared}$ with separate rates for private and shared resources, derived from $R = R_{base} \cdot \frac{T_{solo}}{T_{congestion}}$ and measured via performance counters. By leveraging reference workloads and regression (with L3 misses as a supplemental signal), Litmus produces prices that closely track ideal discounts, achieving an average deviation of about $0.2\%$ in heavily congested environments. The approach maintains high resource utilization while offering tenants fair compensation for slowdown, and it demonstrates robustness across temporal sharing, CPU frequency variations, and different CPU architectures.
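To make the two formulas above concrete, the following is a minimal sketch of how a congestion-discounted price could be computed. It is an illustration under stated assumptions, not the authors' implementation: the function names, the per-millisecond billing unit, and the idea that each component's price is its discounted rate multiplied by the time measured under congestion are all assumptions made here. In the paper the slowdown is estimated from the Litmus test and performance counters; in this sketch the solo and congested times are simply passed in as known values.

```python
def discounted_rate(base_rate: float, t_solo: float, t_congestion: float) -> float:
    """Return R = R_base * T_solo / T_congestion.

    t_solo is the time the work would take on an uncongested machine and
    t_congestion the time it takes under congestion (both hypothetical
    inputs here; the paper estimates the slowdown rather than measuring
    a solo run of the tenant's function).
    """
    if t_solo <= 0 or t_congestion <= 0:
        raise ValueError("execution times must be positive")
    return base_rate * t_solo / t_congestion


def litmus_price(
    private_base_rate: float,
    private_t_solo: float,
    private_t_congestion: float,
    shared_base_rate: float,
    shared_t_solo: float,
    shared_t_congestion: float,
) -> float:
    """Return P = P_private + P_shared.

    Assumption of this sketch: each component is billed as its discounted
    rate times the time actually spent on that resource class under
    congestion.
    """
    p_private = (
        discounted_rate(private_base_rate, private_t_solo, private_t_congestion)
        * private_t_congestion
    )
    p_shared = (
        discounted_rate(shared_base_rate, shared_t_solo, shared_t_congestion)
        * shared_t_congestion
    )
    return p_private + p_shared


if __name__ == "__main__":
    # Hypothetical numbers: base rate of 1.0 price unit per ms for both
    # resource classes; private work slowed from 64 ms to 80 ms, shared
    # work slowed from 40 ms to 60 ms.
    price = litmus_price(1.0, 64.0, 80.0, 1.0, 40.0, 60.0)
    undiscounted = 1.0 * 80.0 + 1.0 * 60.0
    print(f"discounted price: {price:.2f}, undiscounted price: {undiscounted:.2f}")
```

With these hypothetical inputs the tenant is billed for roughly the 104 ms the function would have taken on an uncongested machine rather than for the 140 ms it actually ran, which is the sense in which the discount compensates for slowdown.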
Abstract
Serverless computing has emerged as a market-dominant paradigm in modern cloud computing, benefiting both cloud providers and tenants. Service providers can optimize their machine utilization, while tenants pay only for the resources they use. To maximize resource utilization, serverless systems co-run numerous short-lived functions and therefore experience frequent shifts in system conditions. When the system becomes overcrowded, a tenant's function may suffer significant slowdowns. Ironically, tenants also incur higher costs during these slowdowns, because commercial serverless platforms charge in proportion to execution time. This paper argues that cloud providers should compensate tenants for losses incurred when the server is over-provisioned. However, estimating a tenant's loss is challenging without pre-profiled information about the tenant's functions, and prior studies have indicated that assessing tenant losses incurs heavy overhead. As a solution, this paper introduces a new pricing model that offers discounts based on the machine's state, estimating the tenant's loss from that state. To monitor the machine state accurately, Litmus pricing frequently conducts Litmus tests, an effective and lightweight method for measuring system congestion. Our experiments show that Litmus pricing can accurately gauge the impact of system congestion and offer nearly ideal prices, with only a 0.2% price difference on average, in a heavily congested system.
