MESC: Re-thinking Algorithmic Priority and/or Criticality Inversions for Heterogeneous MCSs
Jiapeng Guan, Ran Wei, Dean You, Yingquan Wang, Ruizhe Yang, Hui Wang, Zhe Jiang
TL;DR
MESC targets real-time predictability in heterogeneous MCSs by introducing instruction-level preemption for DNN accelerators, mitigating algorithmic priority and criticality inversions. It comprises Gemmini_rt, a micro-architecture with a default config channel, a config-copy buffer, and an address remapper, plus an OS-level task monitor and scheduler, supported by a theoretical worst-case response time (WCRT) model and a local-memory allocation strategy. Experiments on an FPGA-based setup show 250x and 300x speedups in resolving priority and criticality inversions, respectively, with less than 5% overhead. The framework offers a practical, end-to-end solution for fine-grained accelerator preemption that preserves data/config consistency and sustains high schedulability in realistic MCS workloads.
Abstract
Modern Mixed-Criticality Systems (MCSs) rely on hardware heterogeneity to satisfy ever-increasing computational demands. However, most heterogeneous co-processors are designed for high throughput, with micro-architectures that execute workloads in a streaming manner. This streaming execution is often non-preemptive or limited-preemptive, which prevents tasks from being prioritised by importance and leads to frequent algorithmic priority and/or criticality inversions. Such problems present a significant barrier to guaranteeing the systems' real-time predictability, especially when co-processors dominate the execution of the workloads (e.g., DNNs and transformers). In contrast to existing works, which typically enable coarse-grained context switching by splitting the workloads/algorithms, we demonstrate a method that provides fine-grained context switching on a widely used open-source DNN accelerator by enabling instruction-level preemption without any modification to the workloads/algorithms. As a systematic solution, we build a real system, Make Each Switch Count (MESC), spanning the SoC, the ISA, and the OS kernel. A theoretical model and analysis are also provided for timing guarantees. Experimental results show that, compared to conventional MCSs using non-preemptive DNN accelerators, MESC achieves 250x and 300x speedups in resolving algorithmic priority and criticality inversions, respectively, with less than 5% overhead. To our knowledge, this is the first work investigating algorithmic priority and criticality inversions for MCSs at the instruction level.
