Towards Multi-dimensional Elasticity for Pervasive Stream Processing Services
Boris Sedlak, Andrea Morichetta, Philipp Raith, Víctor Casamayor Pujol, Schahram Dustdar
TL;DR
The paper tackles SLO fulfillment for streaming workloads on resource-constrained Edge devices by introducing a two-layer elastic architecture: service-specific Local Scaling Agents (LSAs) that perform multi-dimensional scaling across resources and service quality, and a Global Service Optimizer (GSO) that reallocates resources by swapping cores when local options are exhausted. LSAs are trained with a Deep Q Network (DQN) using a virtual training environment built from Linear Gaussian Bayesian Networks (LGBNs) to minimize the discrepancy $\Delta$ between the optimal SLO fulfillment $\phi_{opt}=1.0$ and the observed $\phi(q,m)$, weighted by $w_q$. The approach is evaluated on an OpenCV-based video processing service, showing improvements over a baseline vertical autoscaler under tight resource constraints and demonstrating global optimization gains via core swaps when resources are exhausted. The work advances edge orchestration by enabling practical, multi-dimensional elasticity and resource-sharing decisions within the Computing Continuum, with clear paths for future refinement such as moving away from DQN to LGBN-based inference and enabling continuous action spaces.
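As a minimal sketch of the objective described above, the discrepancy $\Delta$ can be read as a weighted sum of the gaps between $\phi_{opt}=1.0$ and each observed fulfillment $\phi(q,m)$. The summary does not give the exact form of $\phi$, so the clipped-ratio definition below, the `SLO` class, and the example thresholds are illustrative assumptions rather than the authors' formulation.

```python
from dataclasses import dataclass

PHI_OPT = 1.0  # optimal SLO fulfillment, as stated in the summary


@dataclass
class SLO:
    """Hypothetical SLO record; field names are illustrative."""
    name: str
    target: float                  # assumed threshold for the metric
    weight: float                  # w_q in the summary
    higher_is_better: bool = True


def fulfillment(slo: SLO, observed: float) -> float:
    """Assumed form of phi(q, m): metric-to-target ratio clipped to [0, 1]."""
    if slo.higher_is_better:
        ratio = observed / slo.target
    else:
        # For metrics such as latency, lower observed values are better.
        ratio = slo.target / observed if observed > 0 else PHI_OPT
    return max(0.0, min(PHI_OPT, ratio))


def discrepancy(slos: list[SLO], observations: dict[str, float]) -> float:
    """Weighted gap between optimal and observed fulfillment (assumed sum form)."""
    return sum(
        s.weight * (PHI_OPT - fulfillment(s, observations[s.name]))
        for s in slos
    )


# Example values are invented for illustration only.
slos = [SLO("fps", target=30.0, weight=1.0), SLO("pixel", target=800.0, weight=0.5)]
observations = {"fps": 15.0, "pixel": 800.0}
delta = discrepancy(slos, observations)
```

In this reading, an agent that lowers $\Delta$ is either raising a resource-bound metric (for example, frames per second) or trading off a quality metric (for example, resolution), with $w_q$ deciding which gap matters more.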
Abstract
This paper proposes a hierarchical solution to scale streaming services across quality and resource dimensions. Modern scenarios, like smart cities, heavily rely on the continuous processing of IoT data to provide real-time services and meet application targets (Service Level Objectives -- SLOs). While the tendency is to process data at nearby Edge devices, this creates a bottleneck because resources can only be provisioned up to a limited capacity. To improve elasticity in Edge environments, we propose to scale services in multiple dimensions -- either the allocated resources or the service quality. We rely on a two-layer architecture where (1) local, service-specific agents ensure SLO fulfillment through multi-dimensional elasticity strategies, and (2) a higher-level agent optimizes global SLO fulfillment by swapping resources between services once no more resources can be allocated. The experimental results are promising: under tight resource constraints, the approach outperforms regular vertical autoscalers.
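The second layer can be illustrated with a small sketch of a core swap between services. It assumes the global agent has some estimator of each service's expected SLO fulfillment for a given core count; the function `best_core_swap`, the `estimate` callback, and the rule that every service keeps at least one core are hypothetical choices made for illustration, not the paper's algorithm.

```python
from itertools import permutations
from typing import Callable, Optional


def best_core_swap(
    cores: dict[str, int],
    estimate: Callable[[str, int], float],
) -> tuple[Optional[tuple[str, str]], float]:
    """Return the (donor, receiver) pair whose one-core swap raises the
    summed estimated fulfillment the most, or (None, 0.0) if no swap helps.

    cores:    service name -> currently allocated cores
    estimate: hypothetical estimator, (service, n_cores) -> fulfillment in [0, 1]
    """
    current = {s: estimate(s, c) for s, c in cores.items()}
    best: Optional[tuple[str, str]] = None
    best_gain = 0.0
    for donor, receiver in permutations(cores, 2):
        if cores[donor] <= 1:
            continue  # assumption: every service keeps at least one core
        gain = (
            estimate(receiver, cores[receiver] + 1) - current[receiver]
            + estimate(donor, cores[donor] - 1) - current[donor]
        )
        if gain > best_gain:
            best, best_gain = (donor, receiver), gain
    return best, best_gain


# Toy estimator: fulfillment grows linearly until an invented core demand is met.
demand = {"a": 2, "b": 4}
swap, gain = best_core_swap(
    {"a": 4, "b": 2},
    lambda service, n: min(1.0, n / demand[service]),
)
```

Here service `a` holds more cores than it needs, so moving one core to `b` raises the total estimated fulfillment without hurting `a`; a swap in the other direction would lower it and is rejected.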
