General Multi-User Distributed Computing
Ali Khalesi
TL;DR
This work addresses the fundamental trade-off between computation, communication, and accuracy in distributed multi-user settings with real-valued target functions. It introduces GMUDC, a unified RKHS-based framework that accommodates arbitrary topologies and nonlinear target transformations, and analyzes performance via quenched and annealed risk formulations. The core contributions include a practical achievability scheme using masked random Fourier features with ridge decoders, and rigorous quenched and annealed fundamental limits that reveal a spectral–coverage duality and an MP-envelope gap for topology-aware design. The results yield explicit design laws showing how risk scales with budgets and topology, including an exponential coverage phase transition, with direct implications for energy-efficient distributed and federated learning in aerospace and edge networks. Overall, GMUDC provides principled guidance for co-designing computation, communication, and learning under resource constraints, with broad applicability to aeronautical and space networks where energy and data efficiency are critical.
Abstract
This work develops a unified learning- and information-theoretic framework for distributed computation and inference across multiple users and servers. The proposed \emph{General Multi-User Distributed Computing (GMUDC)} model characterizes how computation, communication, and accuracy can be jointly optimized when users demand heterogeneous target functions that are arbitrary transformations of shared real-valued subfunctions. Without any separability assumption, and requiring only that each target function lies in a reproducing-kernel Hilbert space associated with a shift-invariant kernel, the framework remains valid for arbitrary connectivity and task-assignment topologies. A dual analysis is introduced: the \emph{quenched design} considers fixed assignments of subfunctions and network topology, while the \emph{annealed design} captures the averaged performance when assignments and links are drawn uniformly at random from a given ensemble. These formulations reveal the fundamental limits governing the trade-offs among computing load, communication load, and reconstruction distortion under computational and communication budgets~$\Gamma$ and~$\Delta$. The analysis establishes a spectral-coverage duality linking generalization capability with network topology and resource allocation, leading to provably efficient and topology-aware distributed designs. The resulting principles provide an \emph{information-energy foundation} for scalable and resource-optimal distributed and federated learning systems, with direct applications to aeronautical, satellite, and edge-intelligent networks where energy and data efficiency are critical.
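As a rough illustration of the kind of scheme the TL;DR names (masked random Fourier features with a ridge decoder), the sketch below approximates a target function in the RKHS of a Gaussian kernel, which is shift-invariant, and fits a ridge decoder on only the features a boolean mask makes available. It is a minimal sketch under stated assumptions, not the paper's construction: treating the mask as a plain selection of feature coordinates, the toy target function, and all names (`rff_features`, `fit_masked_ridge`, `partial_mask`, `lam`) are hypothetical choices made for illustration.

```python
import numpy as np

def rff_features(X, W, b):
    """Random Fourier features for a Gaussian kernel: sqrt(2/D) * cos(XW + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def fit_masked_ridge(X, y, mask, W, b, lam=1e-3):
    """Ridge decoder that only sees the features selected by the boolean mask."""
    Z = rff_features(X, W, b)[:, mask]
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ y)

def predict(X, mask, W, b, coef):
    """Apply a fitted ridge decoder to the masked feature map."""
    return rff_features(X, W, b)[:, mask] @ coef

def target(X):
    """Toy smooth target function (an assumption, not taken from the paper)."""
    return np.sin(2.0 * X[:, 0]) * np.cos(X[:, 1])

rng = np.random.default_rng(0)
d, D, n = 2, 200, 400                        # input dim, feature count, samples
W = rng.normal(size=(d, D))                  # frequencies, unit-bandwidth Gaussian kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=D)    # random phases

X_train = rng.uniform(-1.0, 1.0, size=(n, d))
X_test = rng.uniform(-1.0, 1.0, size=(n, d))
y_train, y_test = target(X_train), target(X_test)

full_mask = np.ones(D, dtype=bool)           # every feature is available
partial_mask = np.arange(D) < 20             # only the first 20 features are available

results = {}
for name, mask in [("full", full_mask), ("partial", partial_mask)]:
    coef = fit_masked_ridge(X_train, y_train, mask, W, b)
    mse = float(np.mean((predict(X_test, mask, W, b, coef) - y_test) ** 2))
    results[name] = (coef, mse)
    print(f"{name:8s} features={int(mask.sum()):3d}  test MSE={mse:.3e}")
```

Comparing the two printed errors gives a feel for how reconstruction distortion depends on how many features the decoder can see. In this toy setup the mask stands in for limited coverage only loosely; it does not model the budgets $\Gamma$ and $\Delta$ or any particular network topology.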
