Interference-Aware Edge Runtime Prediction with Conformal Matrix Completion

Tianshu Huang, Arjun Ramesh, Emily Ruppel, Nuno Pereira, Anthony Rowe, Carlee Joe-Wong

TL;DR

The paper addresses edge-runtime prediction under interference in heterogeneous devices by formulating runtime estimation as an interference-aware matrix completion problem. It proposes Pitot, a two-tower matrix factorization model that incorporates workload/platform features, a low-rank interference term, and log-residual training, together with Conformalized Quantile Regression for calibrated, tight uncertainty bounds. On a WebAssembly-based dataset spanning 24 devices and 249 workloads, Pitot achieves about 5.2% mean absolute percent error and provides tight, calibrated prediction intervals, outperforming baselines by up to 2x in accuracy and significantly improving bound tightness. The work demonstrates strong data efficiency, interpretable embeddings, and practical potential for edge orchestration, with open avenues for online learning and broader benchmarking datasets.
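As a rough illustration of the calibration step named above, the following is a minimal sketch of the standard two-sided Conformalized Quantile Regression procedure in plain Python. It is not taken from the paper: the function names `cqr_offset` and `cqr_interval` are hypothetical, the lower and upper quantile predictions are assumed to come from some already-trained model, and the exact variant used by Pitot is not specified in this summary.

```python
import math


def cqr_offset(lo_preds, hi_preds, targets, alpha):
    """Compute the conformal offset from a held-out calibration set.

    lo_preds / hi_preds: predicted lower and upper quantiles (assumed given).
    targets: observed values for the same calibration points.
    alpha: target miscoverage rate, e.g. 0.1 for 90% coverage.
    """
    # Conformity score: how far each target falls outside its predicted
    # interval (negative when the target lies strictly inside).
    scores = sorted(
        max(lo - y, y - hi) for lo, hi, y in zip(lo_preds, hi_preds, targets)
    )
    n = len(scores)
    # Finite-sample corrected rank of the (1 - alpha) empirical quantile.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: only a trivial
        # (infinitely wide) interval carries the guarantee.
        return math.inf
    return scores[rank - 1]


def cqr_interval(lo, hi, offset):
    """Widen (or shrink) a predicted interval by the calibrated offset."""
    return lo - offset, hi + offset
```

For example, with three calibration points and `alpha=0.5`, the offset is the second-smallest conformity score, and every new interval is widened by that amount on both sides. The coverage guarantee relies on the calibration and test points being exchangeable.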

Abstract

Accurately estimating workload runtime is a longstanding goal in computer systems, and plays a key role in efficient resource provisioning, latency minimization, and various other system management tasks. Runtime prediction is particularly important for managing increasingly complex distributed systems in which more sophisticated processing is pushed to the edge in search of better latency. Previous approaches for runtime prediction in edge systems suffer from poor data efficiency or require intensive instrumentation; these challenges are compounded in heterogeneous edge computing environments, where historical runtime data may be sparsely available and instrumentation is often challenging. Moreover, edge computing environments often feature multi-tenancy due to limited resources at the network edge, potentially leading to interference between workloads and further complicating the runtime prediction problem. Drawing from insights across machine learning and computer systems, we design a matrix factorization-inspired method that generates accurate interference-aware predictions with tight provably-guaranteed uncertainty bounds. We validate our method on a novel WebAssembly runtime dataset collected from 24 unique devices, achieving a prediction error of 5.2% -- 2x better than a naive application of existing methods.

Paper Structure

This paper contains 90 sections, 1 theorem, 14 equations, 12 figures, 3 tables.

Key Result

Proposition 1

Since the log-loss (Eq. \ref{eq:log_loss}) is convex in $\bar{w}_i$ and $\bar{p}_j$, we can efficiently learn the linear scaling model $\log(\bar{C}_{ij}) = \bar{w}_i + \bar{p}_j$ from $C_{ij}^*$ by alternating minimization over $\bar{w}_i$ and $\bar{p}_j$, using an update rule for $\bar{w}_i$ and a similar rule for $\bar{p}_j$.
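The alternating minimization in Proposition 1 can be sketched as follows. This is a hedged illustration, not the paper's implementation: the log-loss is not shown in this summary, so the sketch assumes it is the squared error between $\log C_{ij}^*$ and $\bar{w}_i + \bar{p}_j$ over the observed entries, in which case each update is simply a mean of residuals. The function name `fit_linear_scaling` and its arguments are hypothetical.

```python
import math


def fit_linear_scaling(observed, n_workloads, n_platforms, n_iters=50):
    """Fit log(C_ij) ~ w_i + p_j on the observed entries only.

    observed: dict mapping (workload i, platform j) -> measured runtime.
    Assumes a squared-error loss in log space (an assumption; the paper's
    exact log-loss is not reproduced here).
    """
    log_c = {ij: math.log(c) for ij, c in observed.items()}
    w = [0.0] * n_workloads
    p = [0.0] * n_platforms
    for _ in range(n_iters):
        # With p fixed, the minimizer for each w_i is the mean residual
        # over the platforms on which workload i was observed.
        for i in range(n_workloads):
            res = [v - p[j] for (a, j), v in log_c.items() if a == i]
            if res:
                w[i] = sum(res) / len(res)
        # Symmetric update for each p_j with w fixed.
        for j in range(n_platforms):
            res = [v - w[i] for (i, b), v in log_c.items() if b == j]
            if res:
                p[j] = sum(res) / len(res)
    return w, p
```

Because the model is additive, $\bar{w}$ and $\bar{p}$ are only identified up to a shared constant shift; the sums $\bar{w}_i + \bar{p}_j$, and hence the predicted runtimes $\exp(\bar{w}_i + \bar{p}_j)$, are what the fit determines. An unobserved pair can be predicted as long as its workload and platform each appear in at least one observed entry and the observations connect them.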

Figures (12)

  • Figure 1: Log-histogram of interference effects in our dataset, sorted by the number of interfering workloads; we observe up to a 20$\times$ slowdown in randomly sampled benchmark combinations.
  • Figure 2: Illustration of Pitot's interpreted profiling, matrix factorization, and interference model. Workload and platform embeddings $\bm{w}_i$, $\bm{p}_j$ are first computed by embedding networks $f_w, f_p$ from input features $\bm{x}_w^{(i)}, \bm{x}_p^{(j)}$ concatenated to learned features $\bm{\varphi}_w^{(i)}, \bm{\varphi}_p^{(j)}$. Then, for each (workload, platform) pair, Pitot adds the inner product $\bm{w}_i^T\bm{p}_j$ to the baseline $\bar{C}_{ij}$. If interfering modules are present, Pitot also computes the interference susceptibility $\bm{w}_i^T\bm{v}_s^{(t)}$ and magnitude $\bm{w}_k^T\bm{v}_g^{(t)}$ for each, and adds an interference term (Eq. \ref{eq:multi_interference}). The resulting prediction is then compared to the observed runtime to train our model weights $\{\bm{\theta}_w, \bm{\theta}_p, \bm{\varphi}_w, \bm{\varphi}_p\}$.
  • Figure 3: Heterogeneous cluster test bench used to collect our dataset; our cluster includes Intel and AMD-based x86 computers, ARM A-class single board computers, as well as a RISC-V SBC and an ARM M-class microcontroller.
  • Figure 4: Ablations of key aspects of Pitot. Each figure shows the mean absolute percent error ($\pm 2$ standard errors) for varying amounts of training data; errors for test data with and without interference are shown separately.
  • Figure 5: Bound tightness ($\pm 2$ standard errors) of our conformalized quantile regression algorithm compared to naive approaches for varying miscoverage rates when trained on 50% of the dataset.
  • ...and 7 more figures

Theorems & Definitions (1)

  • Proposition 1