Shaved Ice: Optimal Compute Resource Commitments for Dynamic Multi-Cloud Workloads

Murray Stokely, Neel Nadgir, Jack Peele, Orestis Kostakis

TL;DR

The paper addresses how large multi-cloud services can balance the cost savings of long-term compute commitments against the risk that demand changes, using a three-year Snowflake demand trace. It identifies demand drivers from user workloads, hardware evolution, and software performance, and documents pronounced diurnal, weekly, and seasonal patterns along with hardware- and software-induced performance shifts. Two core optimizations are developed: (i) selecting an optimal commitment level $c$ by minimizing $C(c)=\int_0^T A\cdot\max(0,f(x)-c)\,dx + \int_0^T B\cdot\max(0,c-f(x))\,dx$ using Brent's method, where $f(x)$ is the demand at time $x$, $A$ weights demand that exceeds the commitment, and $B$ weights unused commitment, and (ii) minimizing the size of a pre-provisioned free pool via predictive pre-provisioning and laddered commitment strategies, while supporting time shifting of deferrable workloads. The work is complemented by a public dataset release and demonstrates practical cost reductions, offering a roadmap for operationalizing capacity planning and demand-shaping techniques in real multi-cloud environments.
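To make the commitment-level objective concrete, the following is a minimal sketch of a discrete version of $C(c)$ evaluated on equally spaced demand samples. It is not the authors' implementation. The demand series is synthetic (a hypothetical baseline of 100 VMs with daily and weekly sinusoids), the function names are illustrative, and a plain golden-section search stands in for Brent's method so that the example needs only the standard library. The weights $A = 2.1$ and $B = 1.0$ are the ones quoted in the Figure 4 caption.

```python
import math


def commitment_cost(c, demand, a=2.1, b=1.0):
    """Discrete analogue of C(c) over equally spaced demand samples."""
    over = sum(max(0.0, f - c) for f in demand)   # demand above the commitment
    under = sum(max(0.0, c - f) for f in demand)  # commitment left unused
    return a * over + b * under


def golden_section_minimize(func, lo, hi, tol=1e-6):
    """Minimize a unimodal function on [lo, hi] by golden-section search.

    Used here as a simple stand-in for Brent's method.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = func(x1), func(x2)
    while hi - lo > tol:
        if f1 <= f2:
            # A minimizer lies in [lo, x2]; reuse x1 as the new upper probe.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = func(x1)
        else:
            # A minimizer lies in [x1, hi]; reuse x2 as the new lower probe.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = func(x2)
    return (lo + hi) / 2


def synthetic_demand(hours=24 * 7 * 4):
    """Hypothetical hourly VM demand with daily and weekly periodicity."""
    demand = []
    for h in range(hours):
        daily = 20.0 * math.sin(2 * math.pi * h / 24)
        weekly = 10.0 * math.sin(2 * math.pi * h / (24 * 7))
        demand.append(100.0 + daily + weekly)
    return demand


demand = synthetic_demand()
best_c = golden_section_minimize(
    lambda c: commitment_cost(c, demand), min(demand), max(demand)
)
print(f"commitment level: {best_c:.2f} VMs")
print(f"cost at that level: {commitment_cost(best_c, demand):.1f}")
```

Because each term of the objective is convex in $c$, the cost has a single basin and any bracketing one-dimensional minimizer applies. If SciPy is available, `scipy.optimize.minimize_scalar` provides Brent's method directly and could replace the hand-written search.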

Abstract

Cloud providers have introduced pricing models to incentivize long-term commitments of compute capacity. These long-term commitments allow the cloud providers to get guaranteed revenue for their investments in data centers and computing infrastructure. However, these commitments expose cloud customers to demand risk if expected future demand does not materialize. While there are existing studies of theoretical techniques for optimizing performance, latency, and cost, relatively little has been reported so far on the trade-offs between cost savings and demand risk for compute commitments for large-scale cloud services. We characterize cloud compute demand based on an extensive three-year study of the Snowflake Data Cloud, which includes data warehousing, data lakes, data science, data engineering, and other workloads across multiple clouds. We quantify capacity demand drivers from user workloads, hardware generational improvements, and software performance improvements. Using this data, we formulate a series of practical optimizations that maximize capacity availability and minimize costs for the cloud customer.

Paper Structure

This paper contains 24 sections, 5 equations, 14 figures, 3 tables, 1 algorithm.

Figures (14)

  • Figure 1: Multi-Cluster, Shared Data Architecture of Snowflake (left) Mapped to Individual Resource Pools of Different Shaped Compute VMs on Different Infrastructure-as-a-Service (IaaS) Cloud Providers (right).
  • Figure 2: (A) Daily demand of VM instances in a Snowflake compute resource pool over 3 years showing significant seasonality with an end of year drop in demand, and (B) Median hourly VM instance demand over a week showing significant daily and weekly periodicity. The confidence interval describes the 95th percentile for all weeks in the full 3-year history.
  • Figure 3: Example demand curve with fixed capacity commitment showing area of demand exceeding commitment level (blue) and area of unused commitment (red hatches).
  • Figure 4: Visualization of the numeric approximation method for computing the commitment-level $c$ that has minimal cost, $C(c)$, for an empirical demand curve with daily and weekly periodicity and weights $A = 2.1, B=1.0$.
  • Figure 5: The week over week growth rate of aggregate VM demand in the dataset shows a significant number of negative weeks, including a clear seasonal pattern, despite a high annual growth rate.
  • ...and 9 more figures