Process Faster, Pay Less: Functional Isolation for Stream Processing

Eleni Zapridou, Michael Koepf, Panagiotis Sioulas, Ioannis Mytilinis, Anastasia Ailamaki

Abstract

Concurrent workloads often extract insights from high-throughput, real-time data streams. Existing stream processing engines isolate each query's resources, ensuring robust performance but incurring high infrastructure costs. In contrast, sharing work reduces the amount of necessary resources but introduces inter-query interference, leading to performance degradation for some queries. We introduce FunShare, a stream-processing system that improves resource efficiency without compromising performance by dynamically grouping queries based on their performance characteristics. FunShare strategically relaxes query interdependencies and minimizes redundant computation while preserving individual query performance. It achieves this by using an adaptive optimization framework that monitors execution metrics, accurately estimates computation overlaps, and reconfigures execution plans on the fly in response to changes in the underlying data streams. Our evaluation demonstrates that FunShare minimizes resource consumption compared to isolated execution while maintaining or improving throughput for all queries.

Paper Structure

This paper contains 29 sections, 2 theorems, 16 equations, 11 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Let $n$ be the number of queries running in FunShare. Algorithm alg:split produces a set of sharing groups within at most $n$ executions. Moreover, in the absence of backpressure, existing sharing groups are not affected.
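
To build intuition for the bound in Theorem 1, the following is a minimal toy sketch, not the paper's splitting algorithm. It assumes a hypothetical backpressure test (a group's total load exceeds a fixed capacity) and a hypothetical split rule (move the heaviest query into its own group). The names `split_until_stable`, `load`, and `capacity` are illustrative and do not come from the paper. Under these assumptions, every execution adds exactly one group, and there can never be more than $n$ groups, so the loop stops within $n$ executions. Groups that are not under backpressure are never modified.

```python
from typing import Dict, List, Tuple


def split_until_stable(
    groups: List[List[str]],
    load: Dict[str, float],
    capacity: float,
) -> Tuple[List[List[str]], int]:
    """Split overloaded groups until none remain; return (groups, executions).

    Toy model only: the backpressure test and the split rule below are
    assumptions made for illustration, not FunShare's actual logic.
    """
    groups = [list(g) for g in groups]  # work on a copy of the input
    executions = 0
    while True:
        # Assumed backpressure test: total load of a multi-query group
        # exceeds the capacity. Singleton groups cannot be split further.
        overloaded = [
            g for g in groups
            if len(g) > 1 and sum(load[q] for q in g) > capacity
        ]
        if not overloaded:
            return groups, executions
        group = overloaded[0]
        # Assumed split rule: isolate the heaviest query of the group.
        heaviest = max(group, key=lambda q: load[q])
        group.remove(heaviest)
        groups.append([heaviest])
        executions += 1  # each execution creates exactly one new group


if __name__ == "__main__":
    load = {"q1": 5.0, "q2": 3.0, "q3": 1.0, "q4": 1.0}
    initial = [["q1", "q2", "q3"], ["q4"]]
    final, steps = split_until_stable(initial, load, capacity=4.0)
    print(final, steps)
```

In this toy run, the group containing only `q4` is never under backpressure and is left untouched, which mirrors the second claim of the theorem under the stated assumptions.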

Figures (11)

  • Figure 1: Sharing using a global plan and the Data--Query model
  • Figure 2: Average throughput during input rate shifts. The workload includes queries with different downstream operations: GROUP BY average and a heavy-weight UDF
  • Figure 3: FunShare architecture
  • Figure 4: Example of the load estimation mechanism
  • Figure 5: Reconfiguration steps for merging two queries
  • ...and 6 more figures

Theorems & Definitions (7)

  • Definition 1: Quality of Service
  • Definition 2: Resources
  • Definition 3: Functional Isolation for Streams
  • Theorem 1
  • proof
  • Theorem 2
  • proof