Proactive and Reactive Autoscaling Techniques for Edge Computing
Suhrid Gupta, Muhammed Tawfiqul Islam, Rajkumar Buyya
TL;DR
Edge computing demands strict SLA compliance for latency-sensitive microservices, which motivates autoscaling approaches beyond cloud defaults. The paper surveys reactive, proactive, and hybrid autoscaling methods and presents a Kubernetes-integrated hybrid autoscaler, evaluated on a six-node edge-cloud setup with DeathStarBench workloads, that shows significant SLA and latency improvements over baseline approaches. Key findings include the ability of hybrid strategies to mitigate cold-start latency and maintain sub-150 ms response times, with improvements of up to roughly 150% in some metrics. The work highlights practical implications for edge deployments and outlines future directions, including multivariate forecasting and multi-SLA handling, to broaden applicability in real-world systems.
Abstract
Edge computing allows for the decentralization of computing resources. This decentralization is achieved by implementing microservice architectures, which require low latencies to meet stringent service level agreements (SLAs) covering performance, reliability, and availability metrics. While cloud computing offers the large data storage and computation resources necessary to handle peak demands, a hybrid cloud and edge environment is required to ensure SLA compliance. Several autoscaling algorithms have been proposed to address these compliance challenges, but they suffer from performance issues and configuration complexity. This chapter provides a brief overview of edge computing architecture, its uses, benefits, and challenges for resource scaling. We then introduce service level agreements and review existing research on algorithms designed to meet these agreements in edge computing environments, along with their benefits and drawbacks.
