Auto-scaling Approaches for Microservice Applications: A Survey and Taxonomy
Minxian Xu, Junhan Liao, Linfeng Wen, Huaming Wu, Kejiang Ye, Rajkumar Buyya, Chengzhong Xu
TL;DR
This paper surveys state-of-the-art auto-scaling approaches for microservice applications since 2018 and presents a taxonomy along five dimensions: infrastructure, architecture, scaling methods, optimization objectives, and behavior modeling. The surveyed approaches aim to balance system optimization with Service Level Agreement (SLA) compliance.
Abstract
Microservice applications are built from loosely coupled components and leverage cloud elasticity to reduce costs and increase development speed. However, they exhibit complex interactions among dynamically evolving services and highly variable workloads, posing significant challenges to auto-scaling mechanisms. Key issues include service dependency management, performance profiling, anomaly detection, workload characterization, and fine-grained resource allocation. To address these challenges, recent auto-scaling approaches leverage historical and runtime data to adapt resource provisioning and optimize system efficiency. Since 2018, when Kubernetes became the first Cloud Native Computing Foundation (CNCF) project to graduate, microservice applications have been widely deployed on standardized orchestration platforms, fundamentally shifting auto-scaling from coarse-grained strategies to service-level, dependency-aware ones. Accordingly, this paper surveys state-of-the-art auto-scaling approaches for microservice applications since 2018 and presents a taxonomy along five dimensions: infrastructure, architecture, scaling methods, optimization objectives, and behavior modeling. Together, these perspectives target key objectives, including resource efficiency, cost efficiency, and Service Level Agreement (SLA) assurance, with the aim of balancing system optimization against SLA compliance. We further present a comprehensive comparison and in-depth analysis of representative approaches, examining their core features, strengths, limitations, and applicable scenarios, as well as their performance across diverse environments and workload conditions.
