Smart HPA: A Resource-Efficient Horizontal Pod Auto-scaler for Microservice Architectures
Hussain Ahmad, Christoph Treude, Markus Wagner, Claudia Szabo
TL;DR
Smart HPA addresses the limitations of traditional HPAs in resource-constrained microservice deployments by introducing a hierarchical auto-scaling framework that combines decentralized per-microservice managers with a centralized resource adapter activated only when needed. The approach leverages MAPE-K-inspired components and resource-efficient heuristics to exchange CPU resources among microservices, reducing overutilization and overprovisioning while eliminating underprovisioning. Empirical evaluation on an AWS-based Kubernetes cluster with the Online Boutique benchmark demonstrates substantial improvements over the Kubernetes baseline, including reduced resource waste and improved allocation efficiency. The work contributes a flexible architectural blueprint, actionable resource-balancing heuristics, and a replication package to support reproduction and extension, with practical implications for scalable, cost-effective microservice management.
Abstract
Microservice architectures have gained prominence in both academia and industry, offering enhanced agility, reusability, and scalability. To simplify scaling operations in microservice architectures, container orchestration platforms such as Kubernetes feature Horizontal Pod Auto-scalers (HPAs) designed to adjust the resources of microservices to accommodate fluctuating workloads. However, existing HPAs are not suitable for resource-constrained environments, as they make scaling decisions based on the individual resource capacities of microservices, leading to service unavailability and performance degradation. Furthermore, HPA architectures exhibit several issues, including inefficient data processing and a lack of coordinated scaling operations. To address these concerns, we propose Smart HPA, a flexible, resource-efficient horizontal pod auto-scaler. It features a hierarchical architecture that integrates both centralized and decentralized architectural styles to leverage their respective strengths while addressing their limitations. We introduce resource-efficient heuristics that empower Smart HPA to exchange resources among microservices, facilitating effective auto-scaling of microservices in resource-constrained environments. Our experimental results show that Smart HPA outperforms the Kubernetes baseline HPA by reducing resource overutilization, overprovisioning, and underprovisioning while increasing resource allocation to microservice applications.
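
The idea of exchanging resources among microservices can be illustrated with a minimal Python sketch. It is not the authors' heuristic: it only shows the general shape of the approach under simplifying assumptions. The replica calculation follows the standard Kubernetes HPA rule (desired replicas = ceil(current replicas × current utilization / target utilization), with the default tolerance band omitted). The names `Service`, `desired_replicas`, and `rebalance_caps` are hypothetical, and the sketch assumes every pod requests the same amount of CPU, so one unused replica slot of one service can cover one missing replica slot of another.

```python
import math
from dataclasses import dataclass


@dataclass
class Service:
    """Hypothetical per-microservice state; field names are illustrative."""
    name: str
    current_replicas: int
    max_replicas: int          # per-service replica cap
    cpu_utilization: float     # average CPU utilization across pods, percent
    target_utilization: float  # target CPU utilization, percent


def desired_replicas(svc: Service) -> int:
    # Standard Kubernetes HPA rule, without the tolerance band:
    # ceil(currentReplicas * currentMetric / targetMetric)
    ratio = svc.cpu_utilization / svc.target_utilization
    return max(1, math.ceil(svc.current_replicas * ratio))


def rebalance_caps(services: list[Service]) -> dict[str, int]:
    """Move spare replica slots from services below their cap to services
    that want more replicas than their cap allows.

    Assumption: all pods request the same CPU, so slots are interchangeable.
    The total number of slots across services never changes.
    """
    desired = {s.name: desired_replicas(s) for s in services}
    caps = {s.name: s.max_replicas for s in services}

    # Services with unused headroom, largest surplus first.
    donors = sorted(
        (s.name for s in services if desired[s.name] < caps[s.name]),
        key=lambda n: caps[n] - desired[n],
        reverse=True,
    )
    # Services whose demand exceeds their cap, largest deficit first.
    needy = sorted(
        (s.name for s in services if desired[s.name] > caps[s.name]),
        key=lambda n: desired[n] - caps[n],
        reverse=True,
    )

    for n in needy:
        deficit = desired[n] - caps[n]
        for d in donors:
            if deficit == 0:
                break
            give = min(caps[d] - desired[d], deficit)
            caps[d] -= give
            caps[n] += give
            deficit -= give
    return caps


if __name__ == "__main__":
    demo = [
        Service("frontend", 4, 5, 90.0, 50.0),  # wants more than its cap
        Service("cart", 2, 5, 20.0, 50.0),      # has unused headroom
    ]
    print(rebalance_caps(demo))
```

In this sketch a service is only asked to give up headroom it does not currently need, which mirrors the abstract's point that a baseline HPA deciding per microservice cannot use capacity left idle elsewhere. When the combined surplus is smaller than the combined deficit, the needy services simply stay partially capped.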
