DASA: Delay-Adaptive Multi-Agent Stochastic Approximation

Nicolò Dal Fabbro, Arman Adibi, H. Vincent Poor, Sanjeev R. Kulkarni, Aritra Mitra, George J. Pappas

TL;DR

The paper addresses distributed stochastic approximation (SA) where $N$ agents solve a common root-finding problem under asynchronous, potentially unbounded up-link delays and Markovian observations. It introduces DASA, a delay-adaptive algorithm that uses a median-based half-sample aggregation to reduce delay-induced error while leveraging variance reduction from averaging, and it proves a non-asymptotic convergence rate. The key contributions are: (i) a rate that depends on the mixing time $\tau_{mix}$ and the average delay $\tau_{avg}$ rather than the maximum delay $\tau_{max}$, (ii) an $N$-fold linear speedup under Markovian sampling, and (iii) empirical validation in distributed temporal-difference learning showing strong performance and scalability. The results demonstrate robustness to heterogeneous, time-varying delays and offer practical insights for scalable distributed reinforcement learning and stochastic optimization. Overall, the work provides a principled framework for leveraging parallel agents in SA with correlated data while mitigating stragglers and delays.

Abstract

We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the up-link transmissions to the server are subject to asynchronous and potentially unbounded time-varying delays. To mitigate the effect of delays and stragglers while reaping the benefits of distributed computation, we propose \texttt{DASA}, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation. We provide a finite-time analysis of \texttt{DASA} assuming that the agents' stochastic observation processes are independent Markov chains. Significantly advancing existing results, \texttt{DASA} is the first algorithm whose convergence rate depends only on the mixing time $\tau_{mix}$ and on the average delay $\tau_{avg}$ while jointly achieving an $N$-fold convergence speedup under Markovian sampling. Our work is relevant for various SA applications, including multi-agent and distributed temporal difference (TD) learning, Q-learning, and stochastic optimization with correlated data.
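The intuition behind an $N$-fold speedup is that averaging $N$ independent noisy observations shrinks the noise variance by a factor of $N$. The sketch below illustrates only that basic effect, under simplifying assumptions that are not the paper's setting: the noise is i.i.d. Gaussian (the paper treats Markovian observations), there are no delays, and the function name `variance_of_average` is hypothetical.

```python
import random
import statistics


def variance_of_average(num_agents, num_trials=20000, seed=0):
    """Empirical variance of the mean of `num_agents` unit-variance samples.

    Assumption: i.i.d. Gaussian noise, used purely for illustration.
    """
    rng = random.Random(seed)
    averages = [
        sum(rng.gauss(0.0, 1.0) for _ in range(num_agents)) / num_agents
        for _ in range(num_trials)
    ]
    return statistics.pvariance(averages)


if __name__ == "__main__":
    v1 = variance_of_average(1)
    v20 = variance_of_average(20)
    # The ratio should be close to the number of agents (20).
    print(f"variance ratio N=1 vs N=20: {v1 / v20:.1f}")
```

With correlated (Markovian) samples and delayed updates the argument is considerably more delicate, which is where the mixing time $\tau_{mix}$ and the average delay $\tau_{avg}$ enter the paper's rate.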

Paper Structure

This paper contains 6 sections, 6 theorems, 61 equations, 2 figures.

Key Result

Theorem 1

Consider the update rule of DASA, and let $T\geq 0$ be such that $\mathcal{X}_T = 1$. There exists a universal constant $C_1\geq 1$ such that for $\alpha \leq \frac{\mu}{C_1L^2\tau_{mix}}$ the following holds:

Figures (2)

  • Figure 1: DASA system model. Agents $1, \dots, N$ cooperatively work to solve the SA problem. At iteration $k$, the server updates ${\boldsymbol{\theta}}_k$ using delayed agents' operators $\{\mathbf{g}({\boldsymbol{\theta}}_{t_{i,k}}, o_{i, t_{i,k}})\}_{i = 1}^N$ and broadcasts ${\boldsymbol{\theta}}_{k+1}$ to the agents.
  • Figure 2: In (a), we show the superior performance of DASA compared to non-adaptive distributed SA under delayed updates ("Delayed") when the number of agents is $N = 10$. In (b), we show the convergence speedup effect comparing a single-agent ($N = 1$) and a multi-agent ($N = 20$) setting. In all simulations, we set $\tau_{max} = 50$.
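The system model in Figure 1 can be sketched as a server loop that averages operators evaluated at stale iterates $\boldsymbol{\theta}_{t_{i,k}}$. The code below is a minimal sketch of the non-adaptive delayed scheme (in the spirit of the "Delayed" baseline in Figure 2), not of DASA's delay-adaptive rule, which is not specified in this summary. Everything concrete in it is an assumption made for illustration: the scalar toy operator $g(\theta, o) = -(\theta - \theta^*) + o$, i.i.d. Gaussian noise in place of Markovian observations, uniformly random delays capped at `tau_max`, and the function name itself.

```python
import random


def delayed_multi_agent_sa(num_agents=10, num_iters=2000, alpha=0.01,
                           tau_max=50, noise_std=1.0, seed=0):
    """Non-adaptive delayed multi-agent SA on a toy scalar problem."""
    rng = random.Random(seed)
    theta_star = 3.0   # hypothetical root of the mean operator
    history = [0.0]    # history[k] stores theta_k, starting from theta_0 = 0

    for k in range(num_iters):
        avg_operator = 0.0
        for _ in range(num_agents):
            # Agent i's contribution was computed at a stale iterate
            # theta_{t_{i,k}} with t_{i,k} = k - delay (assumed uniform delay).
            delay = rng.randint(0, min(k, tau_max))
            stale_theta = history[k - delay]
            # Toy noisy operator; i.i.d. noise is an assumption, the paper
            # considers Markovian observation processes.
            avg_operator += -(stale_theta - theta_star) + rng.gauss(0.0, noise_std)
        avg_operator /= num_agents

        # Server update with the averaged delayed operators; the new iterate
        # would then be broadcast to all agents.
        history.append(history[k] + alpha * avg_operator)

    return history, theta_star


if __name__ == "__main__":
    history, theta_star = delayed_multi_agent_sa()
    print(f"final error: {abs(history[-1] - theta_star):.4f}")
```

In this baseline every update waits on nothing and simply uses whatever stale information is available, so its behavior is governed by the largest delay; the paper's point is that a delay-adaptive rule can instead achieve a rate depending on $\tau_{avg}$.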

Theorems & Definitions (14)

  • Remark 1
  • Remark 2
  • Definition 1
  • Theorem 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • ...and 4 more