Towards Multi-dimensional Elasticity for Pervasive Stream Processing Services

Boris Sedlak, Andrea Morichetta, Philipp Raith, Víctor Casamayor Pujol, Schahram Dustdar

TL;DR

The paper tackles SLO fulfillment for streaming workloads on resource-constrained Edge devices by introducing a two-layer elastic architecture: service-specific Local Scaling Agents (LSAs) that perform multi-dimensional scaling across resources and service quality, and a Global Service Optimizer (GSO) that reallocates resources by swapping cores when local options are exhausted. LSAs are trained with a Deep Q Network (DQN) using a virtual training environment built from Linear Gaussian Bayesian Networks (LGBNs) to minimize the discrepancy $\Delta$ between the optimal SLO fulfillment $\phi_{opt}=1.0$ and the observed $\phi(q,m)$, weighted by $w_q$. The approach is evaluated on an OpenCV-based video processing service, showing improvements over a baseline vertical autoscaler under tight resource constraints and demonstrating global optimization gains via core swaps when resources are exhausted. The work advances edge orchestration by enabling practical, multi-dimensional elasticity and resource-sharing decisions within the Computing Continuum, with clear paths for future refinement such as moving away from DQN to LGBN-based inference and enabling continuous action spaces.
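As a rough illustration of the training objective described above, the weighted discrepancy can be sketched as a sum over SLOs of $w_q \cdot (\phi_{opt} - \phi(q,m))$. The summary does not give the exact aggregation, so the summation form, the clipping to $[0, 1]$, the function name `weighted_discrepancy`, and the SLO names and numbers below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Mapping

PHI_OPT = 1.0  # optimal SLO fulfillment, as stated in the summary


def weighted_discrepancy(
    fulfillment: Mapping[str, float],
    weights: Mapping[str, float],
) -> float:
    """Return sum_q w_q * (phi_opt - phi_q).

    Assumption: the per-SLO gaps are combined by a weighted sum and each
    observed fulfillment is clipped to [0, phi_opt] before taking the gap.
    """
    total = 0.0
    for slo, phi in fulfillment.items():
        phi_clipped = min(max(phi, 0.0), PHI_OPT)
        total += weights[slo] * (PHI_OPT - phi_clipped)
    return total


# Hypothetical SLOs of a video processing service (names are illustrative).
observed = {"fps": 0.8, "pixel": 1.0}
slo_weights = {"fps": 1.0, "pixel": 0.5}
delta = weighted_discrepancy(observed, slo_weights)
```

Under these assumptions, a scaling agent would prefer actions that drive `delta` toward zero; only the `fps` SLO contributes here, because `pixel` is already fully satisfied.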

Abstract

This paper proposes a hierarchical solution to scale streaming services across quality and resource dimensions. Modern scenarios, like smart cities, heavily rely on the continuous processing of IoT data to provide real-time services and meet application targets (Service Level Objectives -- SLOs). While the tendency is to process data at nearby Edge devices, this creates a bottleneck because resources can only be provisioned up to a limited capacity. To improve elasticity in Edge environments, we propose to scale services in multiple dimensions -- either resources or, alternatively, the service quality. We rely on a two-layer architecture where (1) local, service-specific agents ensure SLO fulfillment through multi-dimensional elasticity strategies; if no more resources can be allocated, (2) a higher-level agent optimizes global SLO fulfillment by swapping resources. The experimental results show promising outcomes, outperforming regular vertical autoscalers, when operating under tight resource constraints.
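The second layer can be pictured with a small sketch: when no free cores remain, a global optimizer tries moving one core from one service to another and keeps the move only if the estimated global SLO fulfillment improves. This is a minimal sketch under stated assumptions, not the paper's GSO: the exhaustive single-core search, the function `best_core_swap`, the toy estimator, and the service names are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple


def best_core_swap(
    cores: Dict[str, int],
    estimate: Callable[[str, int], float],
) -> Optional[Tuple[str, str]]:
    """Return (donor, receiver) for the single-core move with the largest
    gain in summed estimated fulfillment, or None if no move helps.

    Assumption: every service must keep at least one core.
    """
    baseline = sum(estimate(s, c) for s, c in cores.items())
    best, best_gain = None, 0.0
    for donor in cores:
        if cores[donor] <= 1:
            continue  # do not starve a service completely
        for receiver in cores:
            if receiver == donor:
                continue
            trial = dict(cores)
            trial[donor] -= 1
            trial[receiver] += 1
            gain = sum(estimate(s, c) for s, c in trial.items()) - baseline
            if gain > best_gain:
                best, best_gain = (donor, receiver), gain
    return best


# Toy estimator (hypothetical): fulfillment grows linearly with cores until
# the service's demand is met. A learned model would take its place.
demand = {"video": 4, "qr": 2}


def toy_estimate(service: str, n_cores: int) -> float:
    return min(1.0, n_cores / demand[service])


swap = best_core_swap({"video": 2, "qr": 4}, toy_estimate)
```

In this toy setup the `qr` service holds more cores than it needs, so the sketch proposes moving one core from `qr` to `video`; once allocations match demand, it returns `None` and leaves the allocation unchanged.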

Paper Structure

This paper contains 13 sections, 2 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Processing IoT data at resource-constrained devices; if SLOs are violated, scale either resources or service quality
  • Figure 2: High-level view of the three-step methodology; continuously observing service executions, training an inference model, and using it for multi-dimensional scaling
  • Figure 3: SLO Fulfillment during runtime; every 10 iterations the SLO thresholds and available resources are changed
  • Figure 4: SLO fulfillment of two services operating with resource contention; the GSO swaps resources to globally improve SLOs