TempoNet: Slack-Quantized Transformer-Guided Reinforcement Scheduler for Adaptive Deadline-Centric Real-Time Dispatchs

Rong Fu, Yibo Meng, Guangzhen Yao, Jiaxuan Lu, Zeyu Zhang, Zhaolu Kang, Ziming Guo, Jia Yee Tan, Xiaojing Du, Simon James Fong

TL;DR

TempoNet is a reinforcement learning scheduler that pairs a permutation-invariant Transformer with a deep Q-approximation, establishing a practical framework for Transformer-based decision making in high-throughput real-time scheduling.

Abstract

Real-time schedulers must reason about tight deadlines under strict compute budgets. We present TempoNet, a reinforcement learning scheduler that pairs a permutation-invariant Transformer with a deep Q-approximation. An Urgency Tokenizer discretizes temporal slack into learnable embeddings, stabilizing value learning and capturing deadline proximity. A latency-aware sparse attention stack with blockwise top-k selection and locality-sensitive chunking enables global reasoning over unordered task sets with near-linear scaling and sub-millisecond inference. A multicore mapping layer converts contextualized Q-scores into processor assignments through masked-greedy selection or differentiable matching. Extensive evaluations on industrial mixed-criticality traces and large multiprocessor settings show consistent gains in deadline fulfillment over analytic schedulers and neural baselines, together with improved optimization stability. Diagnostics include sensitivity analyses for slack quantization, attention-driven policy interpretation, hardware-in-the-loop and kernel micro-benchmarks, and robustness under stress with simple runtime mitigations; we also report sample-efficiency benefits from behavioral-cloning pretraining and compatibility with an actor-critic variant without altering the inference pipeline. These results establish a practical framework for Transformer-based decision making in high-throughput real-time scheduling.
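The masked-greedy mapping mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `masked_greedy_assign`, the `ready` mask, and the one-task-per-core rule are assumptions made for illustration. The idea shown is only that per-task Q-scores are turned into core assignments by repeatedly picking the best unmasked task and masking it.

```python
from typing import List, Optional


def masked_greedy_assign(
    q_scores: List[float], ready: List[bool], num_cores: int
) -> List[Optional[int]]:
    """Hypothetical helper: give each core the best remaining ready task.

    Returns one entry per core: the chosen task index, or None if no
    ready task is left for that core.
    """
    if len(q_scores) != len(ready):
        raise ValueError("q_scores and ready must have the same length")

    # A task is masked if it is not ready or has already been assigned.
    masked = [not r for r in ready]
    assignment: List[Optional[int]] = []

    for _ in range(num_cores):
        best: Optional[int] = None
        for i, q in enumerate(q_scores):
            if masked[i]:
                continue
            if best is None or q > q_scores[best]:
                best = i
        assignment.append(best)
        if best is not None:
            masked[best] = True  # exclude this task from later cores

    return assignment


if __name__ == "__main__":
    # Task 2 is not ready, so it is never selected.
    print(masked_greedy_assign([0.2, 0.9, 0.5, 0.7], [True, True, False, True], 2))
```

Greedy selection costs one pass over the task set per core, which is one plausible reason to prefer it over a full matching when inference latency is tight; the abstract lists differentiable matching as the alternative.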

Paper Structure

This paper contains 157 sections, 10 theorems, 99 equations, 12 figures, 20 tables, 1 algorithm.

Key Result

Theorem A.1

Given a task distribution $\mathcal{D}$ and a target scheduling function that is $L$-Lipschitz with respect to slack, there exists a non-zero lower bound on the expected miss rate gap between continuous and quantized architectures. In this bound, $\mathcal{M}(\pi)$ denotes the miss rate of policy $\pi$, $L$ is the Lipschitz constant of the optimal scheduling manifold, and $\Delta = S_{\max}/Q$.
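To make the quantity $\Delta = S_{\max}/Q$ concrete, the following is a minimal sketch of a clip-and-floor slack quantizer of the kind the Figure 1 caption describes. It assumes that $Q$ is the number of slack bins and $S_{\max}$ is the clipping ceiling; the function name `quantize_slack` and the clamping of the top edge into the last bin are illustrative choices, not details taken from the paper.

```python
import math


def quantize_slack(slack: float, s_max: float, num_bins: int) -> int:
    """Hypothetical clip-and-floor quantizer with bin width s_max / num_bins."""
    if s_max <= 0 or num_bins <= 0:
        raise ValueError("s_max and num_bins must be positive")

    delta = s_max / num_bins                 # bin width, Delta = S_max / Q
    clipped = min(max(slack, 0.0), s_max)    # clip slack into [0, S_max]
    index = int(math.floor(clipped / delta)) # floor to a bin index
    return min(index, num_bins - 1)          # slack == S_max falls in the last bin


if __name__ == "__main__":
    for s in (-1.0, 0.0, 3.7, 10.0, 25.0):
        print(s, "->", quantize_slack(s, s_max=10.0, num_bins=5))
```

Under these assumptions, two slack values inside the clipping range that land in the same bin differ by less than $\Delta$, so a larger $Q$ gives a finer view of urgency at the cost of a larger embedding vocabulary.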

Figures (12)

  • Figure 1: Overview of the TempoNet architecture for adaptive deadline-centric real-time dispatching. The pipeline begins with the Urgency Tokenizer (UT), which maps continuous per-job slack $s_i(t)$ to a discrete vocabulary via Slack Quantization (clip and floor) and retrieves learned Urgency Tokens $\mathbf{x}_i(t)$ from an embedding matrix $\mathbf{E}$. These tokens are gathered into a Token Assembly matrix $\mathbf{X}(t)$, maintaining permutation invariance. At the core, a Transformer Encoder stacks $L$ blocks of Multi-Head Attention and Position-wise Feed-Forward Networks to generate contextualized task representations $\mathbf{H}^{(L)}$. The Q-Value Projection layer maps these representations to per-token Q-scores $\mathbf{q}(t)$, which are passed to a Multicore Mapping module that uses an Iterative Masked-Greedy or bipartite matching strategy to determine the final action $a_t$. The framework is optimized via a Deep Q-Learning loop, in which experiences are stored in a Replay Buffer $\mathcal{D}$ to update the primary network $Q_\theta$ against a soft-updated Target Network $Q_{\theta^-}$.
  • Figure 2: Attention-Criticality Correlation Analysis
  • Figure 3: Attention Focus Distribution Across Tasks heatmap
  • Figure 4: Computational Time Scaling with System Size
  • Figure 5: Entropy Distribution Across Transformer Layers
  • ...and 7 more figures
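
The Figure 1 caption refers to a soft-updated target network $Q_{\theta^-}$. A common way to realize a soft update is Polyak averaging, $\theta^- \leftarrow \tau\,\theta + (1-\tau)\,\theta^-$. The sketch below assumes that form; the mixing rate `tau`, the flat list-of-floats parameter layout, and the function name `soft_update` are hypothetical and are not taken from the paper.

```python
from typing import List


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Hypothetical Polyak update: blend online weights into target weights."""
    if len(target) != len(online):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")

    # tau = 1 copies the online network; tau = 0 leaves the target unchanged.
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


if __name__ == "__main__":
    theta_target = [0.0, 1.0]
    theta_online = [1.0, 3.0]
    print(soft_update(theta_target, theta_online, tau=0.5))
```

A small `tau` makes the bootstrapped targets drift slowly, which is the usual motivation for soft updates in deep Q-learning.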

Theorems & Definitions (17)

  • Theorem A.1
  • proof
  • Theorem C.1
  • proof
  • Lemma D.1
  • proof
  • Theorem D.2
  • proof
  • Lemma G.1: Regret-to-Bellman residual decomposition
  • Lemma G.2: Approximation bias from slack quantization
  • ...and 7 more