Multi-Objective Optimization of Consumer Group Autoscaling in Message Broker Systems

Diogo Landau, Nishant Saurabh, Xavier Andrade, Jorge G Barbosa

TL;DR

This work tackles autoscaling of consumer groups in message brokers under skewed, variable-size workloads by modeling per-partition production as variable item sizes within a Variable Item Size Bin Packing (VISBP) framework. It introduces the Rscore metric to quantify rebalance costs and proposes four Modified Any Fit bin packing heuristics to minimize both operational and migration costs. Experiments compare against Kafka's assignment strategy, showing a much lower 90th percentile latency at equivalent resource use (4.52s vs 217s) and favorable Pareto-front trade-offs as load variability grows. A real-world Kafka-based autoscaler demonstrates deterministic, load-aware partition assignments with practical deployment considerations. Overall, the approach provides a principled, scalable method to deterministically balance throughput and reconfiguration costs in streaming systems.

Abstract

Message brokers often mediate communication between data producers and consumers by adding variable-sized messages to ordered distributed queues. Our goal is to determine the number of consumers and the consumer-partition assignments needed to ensure that the rate of data consumption keeps up with the rate of data production. We model the problem as a variable item size bin packing problem. As the rate of production varies, new consumer-partition assignments are computed, which may require rebalancing a partition from one consumer to another. While a queue is being rebalanced, the data produced into it is not read, leading to additional latency costs. As such, we focus on the multi-objective optimization problem of minimizing both the number of consumers and the number of queue migrations. We present a variety of algorithms and compare them to established bin packing heuristics for this application. Compared with Kafka's assignment strategy, a commonly employed one, our proposed consumer group assignment strategy achieves a 90th percentile latency of 4.52s versus Kafka's 217s, with both using the same number of consumers. Kafka's assignment strategy only improved the consumer group's latency in configurations that used at least 60% more resources than our approach.
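To make the bin packing view concrete, here is a minimal sketch that treats each partition's write rate as an item size and each consumer's maximum consumption rate as the bin capacity. It uses plain First Fit Decreasing, one of the established bin packing heuristics, and not the paper's Modified Any Fit algorithms. The migration count is a simple stand-in for rebalance cost, not the Rscore metric. The names `first_fit_decreasing`, `count_migrations`, the example rates, and the capacity value are all hypothetical.

```python
from typing import Dict, List


def first_fit_decreasing(rates: Dict[str, float], capacity: float) -> List[List[str]]:
    """Assign partitions (items) to consumers (bins) with First Fit Decreasing.

    rates maps a partition name to its write rate; capacity is the assumed
    maximum rate one consumer can sustain, in the same unit.
    """
    bins: List[List[str]] = []   # bins[i] = partitions assigned to consumer i
    loads: List[float] = []      # loads[i] = total rate assigned to consumer i
    # Largest rates first; ties are broken by name so the result is deterministic.
    for partition in sorted(rates, key=lambda p: (-rates[p], p)):
        rate = rates[partition]
        for i, load in enumerate(loads):
            if load + rate <= capacity:
                bins[i].append(partition)
                loads[i] += rate
                break
        else:
            # No existing consumer has room, so open a new one. A partition
            # whose rate exceeds the capacity still gets a consumer of its own.
            bins.append([partition])
            loads.append(rate)
    return bins


def count_migrations(previous: Dict[str, int], current: List[List[str]]) -> int:
    """Count partitions whose consumer index changed between two assignments."""
    moved = 0
    for consumer_id, partitions in enumerate(current):
        for partition in partitions:
            if partition in previous and previous[partition] != consumer_id:
                moved += 1
    return moved


# Hypothetical write rates (e.g. MB/s) and consumer capacity.
example_rates = {"p0": 6.0, "p1": 5.0, "p2": 4.0, "p3": 3.0, "p4": 2.0}
example_assignment = first_fit_decreasing(example_rates, capacity=10.0)

# Hypothetical assignment from an earlier iteration (partition -> consumer index).
example_previous = {"p0": 0, "p1": 0, "p2": 1, "p3": 1, "p4": 1}
example_moved = count_migrations(example_previous, example_assignment)
```

With these example values the packing needs two consumers, and two partitions change owner relative to the earlier assignment. A heuristic that ignores the previous assignment, as this one does, can keep the consumer count low while still triggering migrations, which is the tension the two objectives capture.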

Paper Structure

This paper contains 22 sections, 17 equations, 17 figures, 6 tables, 1 algorithm.

Figures (17)

  • Figure 1: Representation of data production and consumption domains within a message broker environment.
  • Figure 2: System architecture based on Kafka, consisting of Monitor, Controller, and Consumer processes.
  • Figure 3: Consumer Insert Cycle.
  • Figure 4: Controller State Machine.
  • Figure 5: Latency histogram for MWF and kd_27.
  • ...and 12 more figures