Load Balancing Using Sparse Communication
Gal Mendelson, Xu Kuang
TL;DR
This work introduces CARE, a modular model for load balancing under sparse communication, in which a load balancer maintains an approximation of the server queue states via a dedicated Approximation component and routes with JSAQ. It proposes MSR-based queue emulation and three communication patterns (RT, DT, ET) that bound the maximal approximation error while using far less than full state information. The authors establish diffusion-scale results (SDDP) showing that, for bounded approximation error, the system achieves asymptotically optimal workload and near-optimal delay, which rigorously links communication sparsity to performance. Simulations show communication reductions of up to ~90% with performance competitive with or superior to JSQ, SQ(2), and RR, offering guidance for practical design decisions in data-center-like systems.
Abstract
Load balancing across parallel servers is an important class of congestion control problems that arises in service systems. An effective load balancer relies heavily on accurate, real-time congestion information to make routing decisions. However, obtaining such information can impose significant communication overheads, especially in demanding applications like those found in modern data centers. We introduce a framework for communication-aware load balancing and design new load balancing algorithms that perform exceptionally well even in scenarios with sparse communication patterns. Central to our approach is state approximation: the load balancer first estimates server states through a communication protocol, and then uses these approximate states within a load balancing algorithm to determine routing decisions. We demonstrate that by using a novel communication protocol, one can achieve accurate queue length approximation with sparse communication: for a maximal approximation error of x, the communication frequency only needs to be O(1/x^2). We further show, via a diffusion analysis, that a constant maximal approximation error is sufficient for achieving asymptotically optimal performance. Taken together, these results demonstrate that highly performant load balancing is possible with very little communication. Through simulations, we observe that the proposed designs match or surpass the performance of state-of-the-art load balancing algorithms while reducing communication rates by up to 90%.
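The state-approximation idea described above can be illustrated with a small discrete-time simulation. The following is a minimal sketch under simplifying assumptions and is not the authors' protocol: time is slotted, each server completes a job with a fixed probability per slot, the balancer is assumed to know that probability and emulates departures with its own independent coin flips, and a server sends its true queue length only when the approximation error exceeds a threshold. All names (`simulate`, `max_error`, `service_prob`, and so on) are hypothetical.

```python
import random


def simulate(num_servers=5, steps=20000, arrival_prob=0.9,
             service_prob=0.2, max_error=2, seed=0):
    """Toy model: route on approximate queue lengths, sync only on large error.

    Returns (messages per time slot, mean total queue length, worst error seen).
    Assumption: the balancer knows service_prob and emulates departures itself.
    """
    rng = random.Random(seed)
    true_q = [0] * num_servers      # actual queue lengths at the servers
    approx_q = [0] * num_servers    # balancer's approximation of those lengths
    messages = 0
    total_queue = 0
    worst_error = 0

    for _ in range(steps):
        # Arrival: join the shortest *approximated* queue. The balancer sees
        # its own routing decision, so both counters move together.
        if rng.random() < arrival_prob:
            target = min(range(num_servers), key=lambda i: approx_q[i])
            true_q[target] += 1
            approx_q[target] += 1

        for i in range(num_servers):
            # Real departure, which the balancer does not observe.
            if true_q[i] > 0 and rng.random() < service_prob:
                true_q[i] -= 1
            # Emulated departure, drawn independently by the balancer.
            if approx_q[i] > 0 and rng.random() < service_prob:
                approx_q[i] -= 1
            # Error-triggered update: the server reports its true length only
            # when the approximation has drifted past the threshold.
            if abs(approx_q[i] - true_q[i]) > max_error:
                approx_q[i] = true_q[i]
                messages += 1
            worst_error = max(worst_error, abs(approx_q[i] - true_q[i]))

        total_queue += sum(true_q)

    return messages / steps, total_queue / steps, worst_error


if __name__ == "__main__":
    for x in (0, 1, 2, 4):
        rate, mean_q, err = simulate(max_error=x)
        print(f"max_error={x}: messages/slot={rate:.3f}, "
              f"mean total queue={mean_q:.2f}, worst error={err}")
```

In this sketch the error at each server changes by at most one per slot and is reset whenever it exceeds `max_error`, so it never ends a slot above that threshold. Because the emulated and real departures roughly cancel, the error behaves like a random walk, which is the usual intuition for why tolerating a larger error x allows much less frequent communication; the sketch is not meant to reproduce the paper's exact rates.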
