NeRCC: Nested-Regression Coded Computing for Resilient Distributed Prediction Serving Systems

Parsa Moradi, Mohammad Ali Maddah-Ali

TL;DR

NeRCC tackles straggler-induced latency in distributed prediction serving with a general three-layer framework: an encoding regression that generates coded inputs, a computing layer in which a cluster of workers runs inference on those inputs, and a decoding regression that approximately recovers the original predictions from whichever coded predictions are available. The key idea is to couple the two regression problems through their regularization terms, yielding a smoothing-spline-based encoder and decoder that are optimized jointly. Empirically, NeRCC and its model-agnostic variant outperform the prior BACC approach on LeNet5, RepVGG, and ViT models by up to 23%, while maintaining competitive relative accuracy across a range of straggler counts. The approach offers a practical pathway to straggler-resilient, low-latency prediction serving in distributed environments.

Abstract

Resilience against stragglers is a critical element of prediction serving systems, which are tasked with executing inferences on input data for a pre-trained machine-learning model. In this paper, we propose NeRCC, a general straggler-resistant framework for approximate coded computing. NeRCC includes three layers: (1) encoding regression and sampling, which generates coded data points as combinations of the original data points; (2) computing, in which a cluster of workers runs inference on the coded data points; and (3) decoding regression and sampling, which approximately recovers the predictions for the original data points from the available predictions on the coded data points. We argue that the overall objective of the framework reveals an underlying interconnection between the two regression models in the encoding and decoding layers. We propose a solution to the nested regression problem by summarizing their dependence in two regularization terms that are jointly optimized. Our extensive experiments on different datasets and various machine learning models, including LeNet5, RepVGG, and Vision Transformer (ViT), demonstrate that NeRCC accurately approximates the original predictions across a wide range of straggler counts, outperforming the state of the art by up to 23%.
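
The three layers described above can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the smoothing-spline regressors for a ridge-regularized Chebyshev polynomial fit so that it needs only NumPy, and the function names (`fit_ridge_poly`, `nercc_like_predict`), the choice of Chebyshev points for the encoding and decoding locations, and the default values of `lam_enc` and `lam_dec` are all assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def fit_ridge_poly(t, y, degree, lam):
    """Ridge-regularized Chebyshev fit from scalars t to rows of y.

    Stand-in (assumption) for the smoothing-spline regressors in the paper.
    """
    V = cheb.chebvander(t, degree)                      # (len(t), degree + 1)
    coef = np.linalg.solve(V.T @ V + lam * np.eye(degree + 1), V.T @ y)
    return lambda s: cheb.chebvander(s, degree) @ coef


def nercc_like_predict(f, X, n_workers, straggler_ids,
                       lam_enc=1e-6, lam_dec=1e-6):
    """Encode K inputs into N coded points, compute, then decode."""
    K = X.shape[0]
    # Hypothetical choice: anchor inputs and workers at Chebyshev points.
    alpha = np.cos(np.pi * (2 * np.arange(K) + 1) / (2 * K))
    beta = np.cos(np.pi * np.arange(n_workers) / (n_workers - 1))

    # Layer 1: encoding regression through (alpha_k, x_k), sampled at beta_n.
    encoder = fit_ridge_poly(alpha, X, K - 1, lam_enc)
    coded = encoder(beta)                               # (N, d)

    # Layer 2: each non-straggling worker runs the model on one coded point.
    alive = np.setdiff1d(np.arange(n_workers), straggler_ids)
    results = np.stack([f(coded[i]) for i in alive])

    # Layer 3: decoding regression through (beta_n, f(coded_n)),
    # sampled back at alpha_k to approximate f(x_k).
    decoder = fit_ridge_poly(beta[alive], results, K - 1, lam_dec)
    return decoder(alpha)


rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))          # K = 4 inputs of dimension 3
W = rng.normal(size=(2, 3))          # toy linear "model"
approx = nercc_like_predict(lambda x: W @ x, X, n_workers=12,
                            straggler_ids=[1, 5, 9])
print(np.max(np.abs(approx - X @ W.T)))
```

With a linear toy model the composed function is itself a low-degree polynomial, so the decoder recovers the predictions almost exactly despite three missing workers. For a real network the recovery is only approximate, and its quality depends on how the two regularization weights are tuned together.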

Paper Structure (19 sections, 5 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: $\texttt{NeRCC}$ framework. It consists of a computing layer sandwiched between two regression layers.
  • Figure 2: The relative accuracy of BACC and $\texttt{NeRCC}$ over a wide range of straggler numbers for $(N, K) = (200, 50)$ and RepVGG architecture.
  • Figure 3: Reconstruction mean squared error of BACC and $\texttt{NeRCC}$ across a wider range of straggler numbers for $(N, K) = (200, 50)$ and RepVGG architecture.
  • Figure 4: MSE of reconstruction for different values of $\lambda_{\textrm{dec}}$ for the RepVGG architecture trained on the CIFAR10 dataset. The yellow line indicates the MSE of the $\texttt{NeRCC}^{Ag}$ framework.
  • Figure 5: MSE of reconstruction for different values of $\lambda_{\textrm{enc}}$ for the RepVGG architecture trained on the CIFAR10 dataset. The yellow line indicates the MSE of the $\texttt{NeRCC}^{Ag}$ framework.
  • ...and 3 more figures