
Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost Efficiency

Chunyang Meng, Haogang Tong, Tianyang Wu, Maolin Pan, Yang Yu

TL;DR

BAScaler tackles the challenge of autoscaling under dynamic and bursty workloads to guarantee SLOs while reducing costs. It combines a prediction-based burst detector, AR-bootstrapped burst overestimation, an SVR-based performance estimator for bursts, and a PPO-driven estimation enhancer to adapt resource provisioning in both bursty and non-bursty regimes. The system operates within a MAPE framework and integrates with Kubernetes, Istio, and Prometheus to monitor, predict, and adjust resources before demand spikes. Experimental results on ten real-world traces show substantial reductions in SLO violations and request errors, along with meaningful cost savings, supporting the effectiveness of the multi-level ML approach. The work highlights the practicality of fine-grained, burst-aware autoscaling for containerized cloud services and provides a public implementation for broader use.

Abstract

Autoscaling automatically scales the resources provided to applications without human intervention, guaranteeing runtime Quality of Service (QoS) while saving costs. However, user-facing cloud applications serve dynamic workloads that are often highly variable and contain bursts, which makes it hard for autoscaling to keep QoS within Service-Level Objectives (SLOs). Conservative strategies risk over-provisioning, while aggressive ones may cause SLO violations, so designing an effective autoscaler is difficult. This paper introduces BAScaler, a Burst-Aware Autoscaling framework for containerized cloud services or applications under complex workloads, which combines multi-level machine learning (ML) techniques to mitigate SLO violations while saving costs. BAScaler incorporates a novel prediction-based burst detection mechanism that distinguishes between predictable periodic workload spikes and actual bursts. When bursts are detected, BAScaler deliberately overestimates them and allocates resources accordingly to address the rapid growth in resource demand. During non-burst periods, BAScaler employs reinforcement learning to correct potential inaccuracies in resource estimation, enabling more precise resource allocation. Experiments across ten real-world workloads demonstrate BAScaler's effectiveness, achieving a 57% average reduction in SLO violations and cutting resource costs by 10% compared to other prominent methods.

Paper Structure (18 sections, 22 equations, 9 figures, 4 tables, 2 algorithms)

This paper contains 18 sections, 22 equations, 9 figures, 4 tables, 2 algorithms.

Figures (9)

  • Figure 1: Typical autoscaling scenarios: right-sizing of resources. (a) Increased requests lead to congestion, prompting the autoscaler to either scale out (adding VMs) or scale up (boosting resources in existing VMs). (b) Decreased requests trigger deprovisioning actions, such as scaling in (removing VMs) or scaling down (reducing resources in existing VMs). Source: adapted from Chenhao Qu (2018) [qu2018auto].
  • Figure 2: An example comparing burst detection methods. The proposed method does not identify periodic workload spikes as bursts. Other statistical methods differ primarily in the statistical threshold $ST_w$. Here, $ST_w = MA_w + 0.6 \times std_w$, as presented in [vlachos2004identifying], where $MA_w$ and $std_w$ denote the moving average and the standard deviation of the workload over the last $w$ time steps, respectively.
  • Figure 3: Overview of BAScaler as a MAPE loop: (a) the cloud-based system, which provides Monitoring and Execution; (b) the burst-aware autoscaler for Analysis and Planning, which includes Workload Prediction, Burst Detection & Handling, Resource Estimation, and Estimation Enhancement.
  • Figure 4: Structure of Informer
  • Figure 5: Calculation process of $V^t$
  • ...and 4 more figures
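
The statistical threshold in the Figure 2 caption, $ST_w = MA_w + 0.6 \times std_w$, can be illustrated with a minimal Python sketch. This is the baseline that the caption compares against, not BAScaler's prediction-based detector. The function name `statistical_bursts`, the choice to compute the window from the `w` observations strictly before the current step, and the use of the population standard deviation are all assumptions made for illustration; the source does not specify them.

```python
from statistics import mean, pstdev


def statistical_bursts(workload, w, k=0.6):
    """Flag steps whose workload exceeds ST_w = MA_w + k * std_w.

    Hypothetical helper for illustration. Assumptions: the window is the
    w observations strictly before step t, and std_w is the population
    standard deviation of that window.
    """
    flags = []
    for t, value in enumerate(workload):
        window = workload[max(0, t - w):t]
        if len(window) < w:
            # Not enough history yet to compute MA_w and std_w.
            flags.append(False)
            continue
        threshold = mean(window) + k * pstdev(window)
        flags.append(value > threshold)
    return flags


# Flat workload with one spike at index 4.
flags = statistical_bursts([10, 10, 10, 10, 50, 10], w=4)
print(flags)  # [False, False, False, False, True, False]
```

Because such a threshold only looks at recent magnitudes, it has no notion of periodicity, so a regularly recurring spike would be flagged each time it occurs. That is the behavior the caption contrasts with the proposed detector.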