A Decentralized Microservice Scheduling Approach Using Service Mesh in Cloud-Edge Systems

Yangyang Wen, Paul Townend, Per-Olov Östberg, Abel Souza, Clément Courageux-Sudan

TL;DR

This paper tackles the scalability and latency challenges of scheduling in cloud‑edge microservice ecosystems by proposing a fully decentralized scheduling framework that embeds autonomous sidecar logic in service mesh environments. It contrasts a MILP-based centralized scheduler with a hop-by-hop, sidecar-driven decentralized scheduler, using SimGrid to model realistic network and compute behavior. Through simulations, the authors provide initial evidence that decentralized scheduling can achieve lower makespans under high concurrency and reduced coordination overhead, while highlighting challenges in load balancing, observability, and policy alignment. The work lays out a system-level architectural direction and outlines future validation on real Kubernetes clusters, with exploration of resilience, cost, and hybrid coordination strategies. Overall, the results suggest that service-mesh-inspired decentralized scheduling offers promising scalability and fault tolerance for cloud‑edge microservices, albeit with nontrivial operational considerations.

Abstract

As microservice-based systems scale across the cloud-edge continuum, traditional centralized scheduling mechanisms increasingly struggle with latency, coordination overhead, and fault tolerance. This paper presents a new architectural direction: leveraging service mesh sidecar proxies as decentralized, in-situ schedulers to enable scalable, low-latency coordination in large-scale, cloud-native environments. We propose embedding lightweight, autonomous scheduling logic into each sidecar, allowing scheduling decisions to be made locally without centralized control. This approach leverages the growing maturity of service mesh infrastructures, which support programmable distributed traffic management. We describe the design of such an architecture and present initial results demonstrating its scalability potential in terms of response time and latency under varying request rates. Rather than delivering a finalized scheduling algorithm, this paper presents a system-level architectural direction and preliminary evidence to support its scalability potential.

Paper Structure

This paper contains 38 sections, 8 equations, 4 figures, 6 tables, 2 algorithms.

Figures (4)

  • Figure 1: Chain-based Execution Model: Hop-by-hop decentralized scheduling driven by sidecar agents using locally cached global metadata.
  • Figure 2: System Architecture: Geo-distributed regions host replicated services. Each actor instance is paired with a sidecar that performs local scheduling using metadata from a shared, eventually-consistent distributed metadata store.
  • Figure 3: 3D surface plots of scheduler wallclock time as a function of service chain length and replica count. Centralized scheduling incurs steep growth in computational demand, while decentralized scheduling remains more scalable.
  • Figure 4: Indicative Simulation Results Showing Makespan Comparison between Centralized and Decentralized Scheduling Algorithms
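
The hop-by-hop model described in the Figure 1 and Figure 2 captions can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Replica` fields, the `LATENCY_MS` table, the linear cost function, and the `queue_cost_ms` weight are all hypothetical choices made for illustration. The only idea taken from the text is that each sidecar picks the next replica in the chain on its own, using a locally cached view of metadata rather than a central scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    region: str
    queue_len: int  # pending requests, as seen in the locally cached metadata


# Hypothetical region-to-region network latency in milliseconds.
LATENCY_MS = {
    ("edge", "edge"): 1.0,
    ("edge", "cloud"): 20.0,
    ("cloud", "edge"): 20.0,
    ("cloud", "cloud"): 1.0,
}


def pick_next_hop(current_region, replicas, queue_cost_ms=5.0):
    """Local decision made by one sidecar: choose the cheapest next replica.

    Cost is an assumed linear model: network latency from the current
    region plus a fixed penalty per queued request.
    """
    if not replicas:
        raise ValueError("no replicas available for the next service")
    return min(
        replicas,
        key=lambda r: LATENCY_MS[(current_region, r.region)]
        + queue_cost_ms * r.queue_len,
    )


def schedule_chain(chain, metadata, start_region):
    """Walk a service chain hop by hop; no global plan is computed up front."""
    region = start_region
    path = []
    for service in chain:
        choice = pick_next_hop(region, metadata[service])
        path.append(choice.name)
        region = choice.region  # the next sidecar decides from its own location
    return path


metadata = {
    "auth": [Replica("auth-edge", "edge", 4), Replica("auth-cloud", "cloud", 0)],
    "db": [Replica("db-edge", "edge", 0), Replica("db-cloud", "cloud", 1)],
}
path = schedule_chain(["auth", "db"], metadata, start_region="edge")
print(path)
```

Because each hop optimizes only its own step, the resulting path is greedy rather than globally optimal, which is consistent with the load-balancing and policy-alignment challenges the summary mentions for decentralized scheduling.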