A Decentralized Microservice Scheduling Approach Using Service Mesh in Cloud-Edge Systems
Yangyang Wen, Paul Townend, Per-Olov Östberg, Abel Souza, Clément Courageux-Sudan
TL;DR
This paper tackles the scalability and latency challenges of scheduling in cloud-edge microservice ecosystems by proposing a fully decentralized scheduling framework that embeds autonomous scheduling logic in service mesh sidecars. It contrasts a centralized scheduler based on mixed-integer linear programming (MILP) with a hop-by-hop, sidecar-driven decentralized scheduler, using SimGrid to model realistic network and compute behavior. Through simulations, the authors provide initial evidence that decentralized scheduling can achieve lower makespans under high concurrency and reduce coordination overhead, while highlighting open challenges in load balancing, observability, and policy alignment. The work lays out a system-level architectural direction and outlines future validation on real Kubernetes clusters, along with exploration of resilience, cost, and hybrid coordination strategies. Overall, the results suggest that service-mesh-inspired decentralized scheduling offers promising scalability and fault tolerance for cloud-edge microservices, albeit with nontrivial operational considerations.
Abstract
As microservice-based systems scale across the cloud-edge continuum, traditional centralized scheduling mechanisms increasingly struggle with latency, coordination overhead, and fault tolerance. This paper presents a new architectural direction: using service mesh sidecar proxies as decentralized, in-situ schedulers to enable scalable, low-latency coordination in large-scale, cloud-native environments. We propose embedding lightweight, autonomous scheduling logic into each sidecar, allowing scheduling decisions to be made locally without centralized control. This approach builds on the growing maturity of service mesh infrastructures, which support programmable distributed traffic management. We describe the design of such an architecture and present initial results on its response time and latency under varying request rates. Rather than delivering a finalized scheduling algorithm, this paper offers a system-level architectural direction and preliminary evidence of its scalability potential.
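To make the idea of sidecar-local scheduling concrete, the following is a minimal sketch and not the authors' algorithm (the paper explicitly does not deliver a finalized one). It assumes a greedy policy in which each sidecar picks the next-hop replica with the lowest locally estimated completion time. The names `Replica`, `Sidecar`, `link_latency_ms`, `service_time_ms`, and `queued` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    """A candidate instance of the next microservice, as seen by one sidecar."""
    name: str
    link_latency_ms: float   # assumed: locally measured network latency to the replica
    service_time_ms: float   # assumed: average time to process one request
    queued: int = 0          # requests this sidecar has sent and not yet seen complete


@dataclass
class Sidecar:
    """Hypothetical sidecar that chooses the next hop from local state only."""
    replicas: List[Replica] = field(default_factory=list)

    def estimate_ms(self, r: Replica) -> float:
        # Estimated completion time: network delay plus waiting behind queued work.
        return r.link_latency_ms + (r.queued + 1) * r.service_time_ms

    def pick_next_hop(self) -> Replica:
        # Greedy local decision; no central scheduler is consulted.
        best = min(self.replicas, key=self.estimate_ms)
        best.queued += 1
        return best

    def on_response(self, r: Replica) -> None:
        # Refresh the local view when a response comes back.
        r.queued = max(0, r.queued - 1)


if __name__ == "__main__":
    sidecar = Sidecar([Replica("edge-a", 2.0, 10.0), Replica("cloud-b", 20.0, 4.0)])
    for _ in range(4):
        print(sidecar.pick_next_hop().name)
```

In this sketch the nearby edge replica is preferred until its local queue grows enough that the more distant replica has the lower estimate, which illustrates how a hop-by-hop decision can spread load without global coordination. A real sidecar would need shared or gossiped load signals, since a purely local queue count ignores traffic from other sidecars.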
