Q-GARS: Quantum-inspired Robust Microservice Chaining Scheduling

Huixiang Zhang, Mahzabeen Emu

Abstract

Microservice-based applications exhibit stochastic latencies that arise from long-tail execution patterns and heterogeneous resource constraints across computational nodes. To address this challenge, we first formulate the scheduling problem as a Quadratic Unconstrained Binary Optimization (QUBO) model, which aligns it with emerging quantum-optimization paradigms. Building on this formulation, we propose Q-GARS (Quantum-Guided Adaptive Robust Scheduling), a hybrid framework that combines the QUBO model with Simulated Quantum Annealing (SQA)-based combinatorial search and online rescheduling, enabling global microservice rank generation and real-time robust adjustment. We treat the SQA-produced rank as a soft prior and update a closed-loop trust weight that adaptively switches and mixes between this prior and a robust proportional-fairness allocator, maintaining robustness under prediction failures and runtime disturbances. Simulation results show that Q-GARS improves average weighted completion time by 2.1% relative to a greedy Shortest Remaining Processing Time (SRPT) baseline, with gains of up to 16.8% under heavy-tailed latency. The adaptive mechanism reduces tail latency under high-variance conditions. In addition, Q-GARS achieves a mean node resource utilization of 0.817, which is 1.1 percentage points above the robust baseline (0.806).
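
The abstract describes mixing a rank-based prior with a proportional-fairness allocator through a trust weight. The Python sketch below illustrates one way such a mix could look on a single shared resource. It is not the paper's algorithm: the function names, the rank-to-share mapping, the multiplicative-weights trust update, and the step size `eta` are all assumptions made for illustration.

```python
import numpy as np

def rank_to_allocation(rank, capacity):
    """Map a rank (0 = highest priority) to capacity shares.
    The 1/(1+rank) scoring is a hypothetical choice, not taken from the paper."""
    scores = 1.0 / (1.0 + np.asarray(rank, dtype=float))
    return capacity * scores / scores.sum()

def proportional_fair_allocation(weights, capacity):
    """Maximise sum_i w_i * log(x_i) subject to sum_i x_i = capacity.
    On one shared resource the optimum is x_i = capacity * w_i / sum(w)."""
    w = np.asarray(weights, dtype=float)
    return capacity * w / w.sum()

def mix(trust, prior, robust):
    """Convex combination: trust = 1 follows the prior, trust = 0 the robust allocator."""
    return trust * prior + (1.0 - trust) * robust

def update_trust(trust, prior_loss, robust_loss, eta=0.5):
    """Assumed multiplicative-weights update: the policy with the lower
    observed loss gains weight. Losses are assumed non-negative and bounded."""
    a = trust * np.exp(-eta * prior_loss)
    b = (1.0 - trust) * np.exp(-eta * robust_loss)
    return a / (a + b)

# Illustrative run: the prior performs well at first, then degrades.
prior = rank_to_allocation([0, 1, 2], capacity=1.0)
robust = proportional_fair_allocation([1.0, 1.0, 1.0], capacity=1.0)
trust = 0.9
for prior_loss, robust_loss in [(0.2, 0.4), (1.0, 0.4), (1.0, 0.4)]:
    allocation = mix(trust, prior, robust)
    trust = update_trust(trust, prior_loss, robust_loss)
```

Because the mixed allocation is a convex combination, it always sums to the node capacity, and the trust weight falls whenever the prior's observed loss exceeds that of the robust allocator.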

Paper Structure (18 sections, 1 theorem, 10 equations, 5 figures, 1 table)

This paper contains 18 sections, 1 theorem, 10 equations, 5 figures, 1 table.

Key Result

Proposition 1

Under the bounded-loss and convexity assumptions, the cumulative scheduling cost of the mixed policy $\mathbf{r}^*(t)$ over a horizon of $T$ decision epochs satisfies

Figures (5)

  • Figure 1: End-to-end request paths and node-local scheduling bottlenecks in a microservice-based user-facing application.
  • Figure 2: Overview of the closed-loop adaptive scheduling framework integrating SQA and real-time feedback control.
  • Figure 3: (a) Histogram showing the distribution of relative performance improvements over the greedy baseline. (b) Scatter plot comparing solution quality, indicating consistent improvements in high-cost regions.
  • Figure 4: (a) Impact of increasing uncertainty level $\alpha$ on average weighted completion time. (b) Cumulative Distribution Function (CDF) of completion times under high volatility ($\alpha=1.5$), highlighting the mitigation of heavy tail risks.
  • Figure 5: System dynamics under a simulated shock interval from time step 300 to 900. (a) The cross-system mean trust parameter $\bar{\lambda}(t)$ drops to trigger the safeguard mode. (b) Q-GARS suppresses the peak surge of the 95th percentile queue backlog. (c) The blocked capacity ratio under Q-GARS stays lowest and recovers to near zero fastest.
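
The figures above report average weighted completion time and relative improvement over a greedy baseline. The Python sketch below shows how these two quantities could be computed for a toy instance. It assumes a single node, no preemption, all jobs available at time zero, and averaging over the number of jobs; the paper's exact metric definition may differ, and the function names are hypothetical.

```python
def average_weighted_completion_time(order, proc_times, weights):
    """Run jobs back to back in the given order and average w_j * C_j,
    where C_j is the completion time of job j."""
    clock = 0.0
    total = 0.0
    for j in order:
        clock += proc_times[j]        # job j finishes at the current clock value
        total += weights[j] * clock
    return total / len(order)

def relative_improvement(baseline_cost, candidate_cost):
    """Fractional cost reduction of the candidate relative to the baseline."""
    return (baseline_cost - candidate_cost) / baseline_cost

proc_times = [3.0, 1.0, 2.0]
weights = [1.0, 1.0, 1.0]

# Arrival order versus shortest-job-first order (what SRPT reduces to
# when all jobs arrive at time zero and nothing is preempted).
fifo_order = [0, 1, 2]
shortest_first = sorted(range(len(proc_times)), key=lambda j: proc_times[j])

fifo_cost = average_weighted_completion_time(fifo_order, proc_times, weights)
sjf_cost = average_weighted_completion_time(shortest_first, proc_times, weights)
gain = relative_improvement(fifo_cost, sjf_cost)
```

In this toy instance the arrival-order schedule has completion times 3, 4, 6 and the shortest-first schedule has 1, 3, 6, so the average drops from 13/3 to 10/3, a relative improvement of 3/13.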

Theorems & Definitions (2)

  • Proposition 1
  • Proof