PBScaler: A Bottleneck-aware Autoscaling Framework for Microservice-based Applications
Shuaiyu Xie, Jian Wang, Bing Li, Zekun Zhang, Duantengchuan Li, Patrick C. K. H
TL;DR
PBScaler addresses autoscaling in microservice architectures by localizing performance bottlenecks (PBs) in order to prevent performance degradation. It introduces TopoRank, a topological-potential-based random walk that combines anomaly potential with Personalized PageRank to identify PBs from a microservice dependency graph. This is followed by offline scaling optimization based on a genetic algorithm (GA) and guided by an SLO-violation predictor. The approach reduces unnecessary scaling, limits oscillations, and improves end-to-end performance while cutting resource consumption, as demonstrated on real-world and emulated workloads. The combination of metric-based observability, offline optimization, and bottleneck-centric decisions provides a practical, scalable solution for cloud-native microservices.
Abstract
Autoscaling is critical for ensuring optimal performance and resource utilization in cloud applications with dynamic workloads. However, traditional autoscaling technologies are typically no longer applicable in microservice-based applications due to the diverse workload patterns and complex interactions between microservices. Specifically, the propagation of performance anomalies through interactions leads to a high number of abnormal microservices, making it difficult to identify the root performance bottlenecks (PBs) and formulate appropriate scaling strategies. In addition, to balance resource consumption and performance, the existing mainstream approaches based on online optimization algorithms require multiple iterations, leading to oscillation and elevating the likelihood of performance degradation. To tackle these issues, we propose PBScaler, a bottleneck-aware autoscaling framework designed to prevent performance degradation in a microservice-based application. The key insight of PBScaler is to locate the PBs. Thus, we propose TopoRank, a novel random walk algorithm based on the topological potential to reduce unnecessary scaling. By integrating TopoRank with an offline performance-aware optimization algorithm, PBScaler optimizes replica management without disrupting the online application. Comprehensive experiments demonstrate that PBScaler outperforms existing state-of-the-art approaches in mitigating performance issues while conserving resources efficiently.
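To make the bottleneck-ranking idea concrete, the following is a minimal sketch of a Personalized PageRank walk over a small dependency graph. It is not the authors' TopoRank: the topological-potential step is omitted, and the service names, anomaly scores, edge direction (caller to callee), damping factor, and the function `personalized_pagerank` are all illustrative assumptions. It only shows how a random walk biased by per-service anomaly scores can concentrate rank on a likely root bottleneck.

```python
# Illustrative sketch only; NOT the paper's TopoRank implementation.
# Assumptions: edges point from caller to callee, anomaly scores are
# hypothetical values in [0, 1], and the damping factor is 0.85.

def personalized_pagerank(edges, preference, damping=0.85, iters=100):
    """Power-iteration Personalized PageRank on a directed graph.

    edges:      list of (source, destination) pairs
    preference: dict mapping every node to a non-negative weight; the walk
                restarts at a node with probability proportional to it
    """
    nodes = sorted(preference)
    total = sum(preference.values())
    if total <= 0:
        raise ValueError("preference weights must sum to a positive value")
    pref = {n: preference[n] / total for n in nodes}

    # Adjacency list of outgoing edges for each node.
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)

    rank = dict(pref)  # start from the preference distribution
    for _ in range(iters):
        # Restart mass: with probability (1 - damping) jump by preference.
        nxt = {n: (1.0 - damping) * pref[n] for n in nodes}
        for n in nodes:
            if out[n]:
                # Follow one outgoing edge, chosen uniformly.
                share = damping * rank[n] / len(out[n])
                for m in out[n]:
                    nxt[m] += share
            else:
                # Node without outgoing edges: redistribute by preference.
                for m in nodes:
                    nxt[m] += damping * rank[n] * pref[m]
        rank = nxt
    return rank


# Hypothetical four-service application: the frontend calls cart and
# catalog, and both of them call a shared database.
edges = [
    ("frontend", "cart"),
    ("frontend", "catalog"),
    ("cart", "db"),
    ("catalog", "db"),
]

# Hypothetical anomaly scores (for example, derived from latency deviation).
anomaly = {"frontend": 0.6, "cart": 0.7, "catalog": 0.1, "db": 0.9}

scores = personalized_pagerank(edges, anomaly)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy setup the walk follows calls downstream and restarts preferentially at anomalous services, so rank accumulates on anomalous services that many anomalous callers depend on. A scaler built on such a ranking would then act only on the top-ranked candidates rather than on every abnormal service.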
