STaleX: A Spatiotemporal-Aware Adaptive Auto-scaling Framework for Microservices

Majid Dashtbani, Ladan Tahvildari

TL;DR

The paper addresses auto-scaling for microservices by incorporating spatiotemporal characteristics to maintain SLOs across service chains. It introduces STaleX, a two-layer framework with a Global Supervisory Component and per-service SpatioTemporal PID (STPID) controllers, integrating spatial dependencies and temporal workload forecasts (via LSTM trained on WorldCup98) to adapt resources. Key contributions include formal problem formulation, adaptive service weighting, and experimental validation on the SockShop benchmark showing significant resource savings (approximately 27%) with SLO performance comparable to or better than Kubernetes HPA. The work demonstrates a practical path toward efficient, dependency-aware auto-scaling in dynamic cloud environments.

Abstract

While cloud environments and auto-scaling solutions have been widely applied to traditional monolithic applications, they face significant limitations with microservices-based architectures. Microservices introduce additional challenges due to their dynamic and spatiotemporal characteristics, which require more efficient and specialized auto-scaling strategies. Centralized auto-scaling for the entire microservice application is insufficient, as each service within a chain has distinct specifications and performance requirements. Therefore, each service requires its own dedicated auto-scaler to address its unique scaling needs effectively, while also considering its dependencies on other services in the chain and on the overall application. This paper presents a combination of control theory, machine learning, and heuristics to address these challenges. We propose an adaptive auto-scaling framework, STaleX, for microservices that integrates spatiotemporal features, enabling real-time resource adjustments to minimize SLO violations. STaleX employs a set of weighted Proportional-Integral-Derivative (PID) controllers, one per service, whose weights are dynamically adjusted by a supervisory unit that integrates spatiotemporal features. This supervisory unit continuously monitors and adjusts both the weights and the resources allocated to each service. Our framework accounts for spatial features, including service specifications and dependencies among services, as well as temporal variations in workload, ensuring that resource allocation is continuously optimized. Through experiments on a microservice-based demo application deployed on a Kubernetes cluster, we demonstrate the effectiveness of our framework in improving performance and reducing costs compared to traditional scaling methods such as the Kubernetes Horizontal Pod Autoscaler (HPA), achieving a 26.9% reduction in resource usage.
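To make the idea of a weighted per-service PID controller concrete, the following is a minimal Python sketch. It is not the paper's implementation: the class name `WeightedPID`, the helper `scale_replicas`, the gain values, and the replica bounds are all hypothetical, and the way the weight multiplies the control signal is an assumption made for illustration. The abstract only states that a supervisory unit adjusts per-service weights; the exact control law is not given here.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedPID:
    """Discrete PID controller whose output is scaled by an external weight.

    Assumption: the weight is set by some supervisory component and simply
    multiplies the raw PID output. The paper's actual formulation may differ.
    """
    kp: float
    ki: float
    kd: float
    weight: float = 1.0
    _integral: float = field(default=0.0, init=False, repr=False)
    _prev_error: float = field(default=0.0, init=False, repr=False)

    def step(self, target: float, measured: float, dt: float = 1.0) -> float:
        # Error is positive when the measured value (e.g. response time)
        # exceeds the target, which should push toward scaling out.
        error = measured - target
        self._integral += error * dt
        derivative = (error - self._prev_error) / dt
        self._prev_error = error
        raw = self.kp * error + self.ki * self._integral + self.kd * derivative
        return self.weight * raw


def scale_replicas(current: int, control: float,
                   min_r: int = 1, max_r: int = 10) -> int:
    """Apply a control signal to a replica count, clamped to [min_r, max_r]."""
    return max(min_r, min(max_r, current + round(control)))


if __name__ == "__main__":
    # Hypothetical service with a 0.5 s response-time target.
    pid = WeightedPID(kp=1.0, ki=0.0, kd=0.0, weight=2.0)
    signal = pid.step(target=0.5, measured=1.5)
    print(scale_replicas(current=3, control=signal))
```

In this sketch a larger weight makes a service react more aggressively to the same SLO error, which is one plausible way a supervisory unit could prioritize services that sit on critical dependency paths.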

Paper Structure

This paper contains 24 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: A) Uniform Scaling - Response Time, B) Uniform Scaling - Resource Usage, C) Differentiated Scaling - Response Time, D) Differentiated Scaling - Resource Usage
  • Figure 2: STaleX framework: Green boxes represent the framework inputs, and blue boxes illustrate the framework components
  • Figure 3: A) RQ1: PID vs WPID vs SPID - Response Time, B) RQ1: PID vs WPID vs SPID - Resource Usage, C) RQ2: HPA vs SPID vs STPID - Response Time, D) RQ2: HPA vs SPID vs STPID - Resource Usage