Performance Optimization in Stream Processing Systems: Experiment-Driven Configuration Tuning for Kafka Streams

David Chen; Sören Henning; Kassiano Matteussi; Rick Rabiser

TL;DR

An experiment-driven approach for automated configuration optimization of Kafka Streams that combines three phases: Latin Hypercube Sampling, Simulated Annealing, and Hill Climbing. Latin Hypercube Sampling with early termination and Simulated Annealing are particularly effective in navigating the configuration space, whereas additional fine-tuning via Hill Climbing yields limited benefits.

Abstract

Configuring stream processing systems for efficient performance, especially in cloud-native deployments, is a challenging and largely manual task. We present an experiment-driven approach for automated configuration optimization that combines three phases: Latin Hypercube Sampling for initial exploration, Simulated Annealing for guided stochastic search, and Hill Climbing for local refinement. The workflow is integrated with the cloud-native Theodolite benchmarking framework, enabling automated experiment orchestration on Kubernetes and early termination of underperforming configurations. In an experimental evaluation with Kafka Streams and a Kubernetes-based cloud testbed, our approach identifies configurations that improve throughput by up to 23% over the default. The results indicate that Latin Hypercube Sampling with early termination and Simulated Annealing are particularly effective in navigating the configuration space, whereas additional fine-tuning via Hill Climbing yields limited benefits.
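
The three-phase workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the parameter names and ranges, the synthetic `measure_throughput` objective (a stand-in for a real benchmark run), the step sizes, and the cooling schedule are all assumptions chosen for illustration. Early termination of underperforming configurations is omitted for brevity.

```python
import math
import random

# Hypothetical tunable parameters and ranges (illustrative, not from the paper).
PARAM_RANGES = {
    "num_threads": (1.0, 16.0),
    "batch_kb": (16.0, 512.0),
    "linger_ms": (0.0, 100.0),
}


def measure_throughput(cfg):
    """Synthetic stand-in for one benchmark experiment (records/s)."""
    return (1000.0
            - 5.0 * (cfg["num_threads"] - 8.0) ** 2
            - ((cfg["batch_kb"] - 256.0) / 10.0) ** 2
            - 0.1 * (cfg["linger_ms"] - 20.0) ** 2)


def latin_hypercube(n, rng):
    """Phase 1: draw n configs with exactly one sample per stratum per parameter."""
    columns = {}
    for name, (lo, hi) in PARAM_RANGES.items():
        strata = list(range(n))
        rng.shuffle(strata)  # independent stratum order for each dimension
        columns[name] = [lo + (s + rng.random()) / n * (hi - lo) for s in strata]
    return [{name: columns[name][i] for name in PARAM_RANGES} for i in range(n)]


def neighbor(cfg, step, rng):
    """Perturb one randomly chosen parameter by up to +/- step of its range."""
    name = rng.choice(list(PARAM_RANGES))
    lo, hi = PARAM_RANGES[name]
    new_cfg = dict(cfg)
    moved = cfg[name] + rng.uniform(-step, step) * (hi - lo)
    new_cfg[name] = min(hi, max(lo, moved))
    return new_cfg


def simulated_annealing(start, iters, t0, cooling, rng):
    """Phase 2: accept worse neighbors with probability exp(delta / temperature)."""
    current, cur_score = start, measure_throughput(start)
    best, best_score = current, cur_score
    temp = t0
    for _ in range(iters):
        cand = neighbor(current, 0.2, rng)
        score = measure_throughput(cand)
        delta = score - cur_score
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_score = cand, score
            if cur_score > best_score:
                best, best_score = current, cur_score
        temp *= cooling  # geometric cooling schedule
    return best, best_score


def hill_climb(start, iters, rng):
    """Phase 3: accept only strictly improving small steps."""
    best, best_score = start, measure_throughput(start)
    for _ in range(iters):
        cand = neighbor(best, 0.05, rng)
        score = measure_throughput(cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


def optimize(seed=0):
    """Run the three phases in sequence; return the score after each phase."""
    rng = random.Random(seed)
    samples = latin_hypercube(10, rng)
    start = max(samples, key=measure_throughput)
    sa_cfg, sa_score = simulated_annealing(start, 50, 100.0, 0.95, rng)
    hc_cfg, hc_score = hill_climb(sa_cfg, 20, rng)
    return measure_throughput(start), sa_score, hc_score, hc_cfg
```

Calling `optimize()` returns the best throughput after each phase along with the final configuration. In a real setting, each call to `measure_throughput` would correspond to a full benchmark experiment, which is why the number of evaluations per phase matters far more than the search logic itself.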

Paper Structure

This paper contains 17 sections, 5 figures, and 1 table.

Introduction
Background and Related Work
Parameter Optimization Approaches
Employed Search Techniques
Performance Optimization Approach
Phase 1: Latin Hypercube Sampling
Phase 2: Simulated Annealing
Phase 3: Hill Climbing
Cloud-native Integration
Experimental Pilot Evaluation
Methodology and Setup
Evaluation of Optimization Phases
Latin Hypercube Sampling
Simulated Annealing
Hill Climbing
...and 2 more sections

Figures (5)

Figure 1: Integration of our optimization approach with Theodolite's cloud-native architecture (EMSE2022).
Figure 2: Correlation of parameters with throughput.
Figure 3: Evolution of throughput (records/s) over Simulated Annealing iterations for different starting configurations.
Figure 4: Evolution of throughput (records/s) over Hill Climbing iterations for different starting configurations.
Figure 5: Latency and throughput of selected configurations.
