General Multi-User Distributed Computing

Ali Khalesi

TL;DR

This work addresses the fundamental trade-off between computation, communication, and accuracy in distributed multi-user settings with real-valued target functions. It introduces GMUDC, a unified RKHS-based framework that accommodates arbitrary topologies and nonlinear target transformations, and analyzes performance via quenched and annealed risk formulations. The core contributions are a practical achievability scheme using masked random Fourier features with ridge decoders, and rigorous quenched and annealed fundamental limits that reveal a spectral–coverage duality and an MP-envelope gap for topology-aware design. The results yield explicit design laws showing how risk scales with budgets and topology, including an exponential coverage phase transition, with direct implications for energy-efficient distributed and federated learning in aerospace and edge networks. Overall, GMUDC provides principled guidance for co-designing computation, communication, and learning under resource constraints, with broad applicability to aeronautical and space networks where energy and data efficiency are critical.

Abstract

This work develops a unified learning- and information-theoretic framework for distributed computation and inference across multiple users and servers. The proposed General Multi-User Distributed Computing (GMUDC) model characterizes how computation, communication, and accuracy can be jointly optimized when users demand heterogeneous target functions that are arbitrary transformations of shared real-valued subfunctions. Without any separability assumption, and requiring only that each target function lies in a reproducing-kernel Hilbert space associated with a shift-invariant kernel, the framework remains valid for arbitrary connectivity and task-assignment topologies. A dual analysis is introduced: the quenched design considers fixed assignments of subfunctions and network topology, while the annealed design captures the averaged performance when assignments and links are drawn uniformly at random from a given ensemble. These formulations reveal the fundamental limits governing the trade-offs among computing load, communication load, and reconstruction distortion under computational and communication budgets $\Gamma$ and $\Delta$. The analysis establishes a spectral-coverage duality linking generalization capability with network topology and resource allocation, leading to provably efficient and topology-aware distributed designs. The resulting principles provide an information-energy foundation for scalable and resource-optimal distributed and federated learning systems, with direct applications to aeronautical, satellite, and edge-intelligent networks where energy and data efficiency are critical.
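To make the ingredients named above more concrete (a shift-invariant kernel, random Fourier features, a mask, and a ridge decoder), the following is a minimal sketch and not the paper's scheme, whose exact construction is not given in this excerpt. It assumes a Gaussian kernel with unit bandwidth, a hypothetical Bernoulli mask standing in for "features this user actually receives", and a made-up toy target; the names `rff_features`, `fit_ridge`, `toy_target`, and `run_demo` are illustrative only.

```python
import numpy as np

def rff_features(x, omegas, phases):
    """Map inputs (n, d) to random Fourier features (n, M): sqrt(2/M) * cos(x w^T + b)."""
    num_features = omegas.shape[0]
    return np.sqrt(2.0 / num_features) * np.cos(x @ omegas.T + phases)

def fit_ridge(features, targets, lam):
    """Closed-form ridge weights: solve (Phi^T Phi + lam * I) w = Phi^T y."""
    dim = features.shape[1]
    gram = features.T @ features + lam * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)

def toy_target(x):
    """Hypothetical smooth target used only for this demo."""
    return np.sin(x[:, 0]) * np.cos(x[:, 1])

def run_demo(seed=0, d=2, num_features=200, n_train=400, n_test=200,
             keep_prob=0.6, lam=1e-3):
    rng = np.random.default_rng(seed)
    # Assumption: Gaussian kernel, so frequencies are standard normal.
    omegas = rng.normal(size=(num_features, d))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    # Assumption: a random binary mask models which features reach the user.
    mask = rng.random(num_features) < keep_prob

    x_train = rng.uniform(-2.0, 2.0, size=(n_train, d))
    x_test = rng.uniform(-2.0, 2.0, size=(n_test, d))

    # The decoder only ever sees the unmasked feature columns.
    phi_train = rff_features(x_train, omegas, phases)[:, mask]
    phi_test = rff_features(x_test, omegas, phases)[:, mask]

    weights = fit_ridge(phi_train, toy_target(x_train), lam)
    mse = float(np.mean((phi_test @ weights - toy_target(x_test)) ** 2))
    return mse, int(mask.sum())

mse, kept = run_demo()
print(f"kept {kept} of 200 features, test MSE = {mse:.4g}")
```

Lowering `keep_prob` in this toy setup shrinks the number of features the decoder can use, which is one simple way to see how reduced coverage can translate into higher reconstruction error.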

Paper Structure

This paper contains 34 sections, 9 theorems, 208 equations, 1 figure.

Key Result

Theorem 1

Fix a realization $(\mathsf{A},\mathsf{L})$ and assume (eq:masked-rff)–(eq:ridge). There exist absolute constants $C_1,C_2,C_3>0$ such that, for any $\delta'\in(0,1)$ and suitable $\lambda=\lambda(M,\underline{m}_{1}(\mathcal{T}),\ldots,\underline{m}_{K}(\mathcal{T}))$, the following holds with probabi[…] where $m_{\mathrm{harm}} \;\triangleq\; (\tfrac{1}{K}\sum_{k=1}^K \tfrac{1}{\underline{m}_k(\mathca[…]

Figures (1)

  • Figure 1: The $K$-user, $N$-server, $T$-shot General Multi-User Distributed Computing framework. Each server $n$ computes a subset of subfunctions $\mathcal{S}_n=\{f_{i_{n,1}}(\cdot),f_{i_{n,2}}(\cdot),\ldots,f_{i_{n,|\mathcal{S}_n|}}(\cdot)\}$ and communicates the corresponding results to a subset of users $\mathcal{T}_{n}$. This operation is performed under the computational constraint $|\mathcal{S}_n|\leq \Gamma\leq L$ and the communication constraint $|\mathcal{T}_n|\leq \Delta \leq K$, which correspond to the normalized budgets $\gamma = \Gamma/L$ and $\delta = \Delta/K$. The system aims to minimize the population risk $\mathcal{R}(\mathcal{D}\mid \mathsf{A},\mathsf{L})$, the average reconstruction error of the users with respect to their desired target functions. Here, $\mathcal{D}$ denotes the overall encoding and decoding strategy, $\mathsf{A}=(\mathcal{S}_1,\ldots,\mathcal{S}_N)$ specifies the computation assignments, and $\mathsf{L}=(\mathcal{T}_1,\ldots,\mathcal{T}_N)$ defines the communication topology between the servers and the users.
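The constraints in the caption can be checked mechanically. The sketch below is illustrative only: it represents $\mathsf{A}$ and $\mathsf{L}$ as lists of Python sets, verifies $|\mathcal{S}_n|\leq\Gamma\leq L$ and $|\mathcal{T}_n|\leq\Delta\leq K$, and reports which subfunctions reach each user. The helper names, the fixed-size uniform sampling in `random_topology`, and the per-pair miss probability $(1-\gamma\delta)^N$ are assumptions made for this example (the latter follows only under independent uniform draws of exactly $\Gamma$ subfunctions and $\Delta$ users per server); none of them is taken from the paper.

```python
import random

def check_budgets(assignments, links, L, K, Gamma, Delta):
    """True if Gamma <= L, Delta <= K, and every server respects both budgets."""
    if Gamma > L or Delta > K:
        return False
    return (all(len(S) <= Gamma for S in assignments)
            and all(len(T) <= Delta for T in links))

def user_coverage(assignments, links, K):
    """For each user k, the set of subfunction indices some linked server delivers."""
    covered = [set() for _ in range(K)]
    for S, T in zip(assignments, links):
        for k in T:
            covered[k] |= S
    return covered

def random_topology(N, L, K, Gamma, Delta, rng):
    """Assumed ensemble: each server draws Gamma subfunctions and Delta users uniformly."""
    assignments = [set(rng.sample(range(L), Gamma)) for _ in range(N)]
    links = [set(rng.sample(range(K), Delta)) for _ in range(N)]
    return assignments, links

def miss_probability(N, gamma, delta):
    """Under the assumed ensemble, chance a fixed (subfunction, user) pair is never served."""
    return (1.0 - gamma * delta) ** N

# Hand-built example: N = 3 servers, L = 4 subfunctions, K = 2 users.
A = [{0, 1}, {1, 2}, {3}]
T = [{0}, {1}, {0}]
L_total, K_total, Gamma, Delta = 4, 2, 2, 1

print("budgets respected:", check_budgets(A, T, L_total, K_total, Gamma, Delta))
print("normalized budgets:", Gamma / L_total, Delta / K_total)
print("coverage per user:", user_coverage(A, T, K_total))
print("assumed miss probability:", miss_probability(3, Gamma / L_total, Delta / K_total))
```

In this hand-built topology user 0 receives subfunctions 0, 1, and 3 while user 1 receives 1 and 2, so neither user sees every subfunction even though both budgets are respected.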

Theorems & Definitions (19)

  • Theorem 1
  • Remark 1
  • Theorem 2
  • Remark 2
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Proposition 1
  • ...and 9 more