Data-Driven Distributed Optimization via Aggregative Tracking and Deep-Learning

Riccardo Brumali, Guido Carnevale, Giuseppe Notarstefano

TL;DR

This work tackles distributed optimization in aggregative settings where each agent faces unknown local costs and only single-point feedback per iteration. It introduces DELTA, a data-driven method that jointly learns descent directions with local neural networks, performs inexact distributed gradient updates, and tracks the aggregative quantity and its gradient through consensus. The authors prove that, under strong convexity, DELTA converges linearly to a neighborhood of the optimum, with the radius depending on the neural-network approximation error, by leveraging two-time-scale stability analysis and averaging theory. Numerical simulations with 20 agents validate the theoretical results, demonstrate fast consensus, effective learning of descent directions, and robustness to changes in local cost functions.

Abstract

In this paper, we propose a novel distributed data-driven optimization scheme. Specifically, we focus on the so-called aggregative framework, a scenario in which a set of agents aims to cooperatively minimize the sum of local costs, each depending on both the local decision variable and an aggregation of all of them. We consider a data-driven setup where each objective function is unknown and can be sampled at a single point per iteration (thanks to, e.g., feedback from users or sensors). We address this scenario through a distributed algorithm combining three components: (i) a learning part leveraging neural networks to learn the descent direction of each local cost, (ii) an optimization routine steering the estimates according to the learned direction to minimize the global cost, and (iii) a tracking mechanism locally reconstructing the unavailable global quantities. Using tools from system theory, i.e., timescale separation and averaging theory, we formally prove that, in strongly convex setups, the distributed scheme converges linearly to a neighborhood of the optimum whose radius depends on the accuracy of the neural networks. Finally, numerical simulations validate the theoretical results.
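
The interplay between components (ii) and (iii) can be illustrated with a minimal sketch. It assumes exact gradients in place of the learned neural-network directions, so it is plain aggregative gradient tracking and not DELTA itself. The toy costs $f_i(x_i,\sigma) = \tfrac{1}{2}(x_i - r_i)^2 + \tfrac{1}{2}(\sigma - b)^2$ with $\phi_i(x_i) = x_i$, the ring graph, the step size, and the function names are all illustrative assumptions; only the symbols $\mathrm{w}_i$, $\mathrm{z}_i$, $\hat{\sigma}_i$, and $\hat{g}_i$ follow the notation of the Figure 2 caption.

```python
def ring_average(v):
    """One consensus step on a ring: average own value with both neighbours."""
    n = len(v)
    return [(v[i - 1] + v[i] + v[(i + 1) % n]) / 3.0 for i in range(n)]


def aggregative_tracking(r, b, gamma=0.05, iters=2000):
    """Aggregative gradient tracking on a toy quadratic problem (assumed setup).

    Agent i holds x_i plus two trackers:
      w_i, so that sigma_hat_i = w_i + phi_i(x_i) estimates the aggregate sigma(x);
      z_i, so that g_hat_i = z_i + grad_2 f_i estimates the average of grad_2 f_j.
    """
    n = len(r)
    x = [0.0] * n
    w = [0.0] * n  # zero initialisation keeps sum(w) = 0 at every iteration
    z = [0.0] * n  # likewise for sum(z)
    for _ in range(iters):
        # Local estimates of the two global quantities (phi_i is the identity).
        sigma_hat = [w[i] + x[i] for i in range(n)]
        grad2 = [sigma_hat[i] - b for i in range(n)]    # grad_2 f_i(x_i, sigma_hat_i)
        g_hat = [z[i] + grad2[i] for i in range(n)]

        # Descent step: grad_1 f_i + grad phi_i * g_hat_i, with grad phi_i = 1.
        x_next = [x[i] - gamma * ((x[i] - r[i]) + g_hat[i]) for i in range(n)]

        # Tracker updates: mix neighbours' estimates, subtract the local term.
        mixed_sigma = ring_average(sigma_hat)
        mixed_g = ring_average(g_hat)
        w = [mixed_sigma[i] - x[i] for i in range(n)]
        z = [mixed_g[i] - grad2[i] for i in range(n)]
        x = x_next
    sigma_hat = [w[i] + x[i] for i in range(n)]
    return x, sigma_hat


if __name__ == "__main__":
    x, sigma_hat = aggregative_tracking([1.0, 2.0, 3.0, 4.0, 5.0], b=0.0)
    print("x         =", [round(v, 4) for v in x])
    print("sigma_hat =", [round(v, 4) for v in sigma_hat])
```

For this toy problem the minimizer is available in closed form, $x_i^\star = r_i + (b - \bar{r})/2$ with $\sigma(x^\star) = (\bar{r} + b)/2$, where $\bar{r}$ is the mean of the $r_i$, so the returned values can be checked directly. In the data-driven setting of the paper the gradients used above are not available, which is where the learning component (i) comes in.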

Paper Structure

This paper contains 20 sections, 6 theorems, 84 equations, 6 figures, 1 algorithm.

Key Result

Theorem 1

Consider DELTA and let Assumptions ass:unknown-ass:NN hold. Further, assume $(\theta_{i}^{0},\mathrm{x}_{i}^{0},\mathrm{w}_{i}^{0},\mathrm{z}_{i}^{0}) \in \mathcal{S}_{i}$ for all $i \in \{1,\ldots,N\}$. Then, there exist $\bar{\gamma}, B, \kappa_1 > 0$ and $\kappa_2 \in (0,1)$ such that, for all […] for all $k \in \mathbb{N}$. $\blacksquare$

Figures (6)

  • Figure 1: Graphical representation of the problem framework.
  • Figure 2: Block diagram of the proposed distributed algorithm, where $\hat{\sigma}_i^k \coloneq \mathrm{w}^k_{i} + \phi_{i}(\mathrm{x}^k_{i})$ and $\hat{g}_i^k \coloneq \mathrm{z}^k_{i} + \nabla_2 \hat{f}_{i}(\mathrm{x}^k_{i},\hat{\sigma}_i^k,\theta_{i})$.
  • Figure 3: a) Comparison between DELTA, DAGT [li2021distributed], and DAGT with the zeroth-order (ZO) gradient approximation of [flaxman2004online]. b) Evolution over time of $|| \hat{G}(\mathrm{x}^k,\mathrm{w}^k,\theta^k) - \nabla f_{\sigma}(\mathrm{x}^k) ||$.
  • Figure 4: Evolution of the tracking dynamics. The dashed lines are the global $\sigma(\mathrm{x}^k)$ (left) and $\sum_{j=1}^N \nabla_2 \hat{f}_{j}(\mathrm{x}^k_{j},\sigma(\mathrm{x}^k),\theta_{j}^k)/N$ (right). The solid lines are the agents' local estimates of these quantities.
  • Figure 5: Agent $i$ perspective: evolution of the learned tangent space (in orange) over the iterations, compared to $f_{i}$ (in blue).
  • ...and 1 more figure

Theorems & Definitions (10)

  • Remark 1
  • Remark 2
  • Remark 3
  • Theorem 1
  • Remark 4
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Theorem 2