Distributed Quasi-Newton Method for Multi-Agent Optimization

Ola Shorinwa; Mac Schwager

TL;DR

This work introduces distributed quasi-Newton methods for multi-agent optimization: DQN for unconstrained problems and EC-DQN for equality-constrained problems. Each agent builds an estimate of the aggregate Hessian and uses it to compute descent directions locally, while reaching consensus through one-hop communication over a fixed network. The methods use dynamic gradient tracking to approximate the global gradient and BFGS/DFP-like updates to maintain positive-definite inverse-Hessian estimates; convergence to stationary points is proven under standard convexity and smoothness assumptions. In ill-conditioned problems, the methods converge faster and at a lower communication cost than existing distributed first- and second-order methods, and they apply to problems such as logistic regression and basis pursuit denoising.
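The BFGS-like update mentioned above can be illustrated with the textbook BFGS formula for an inverse-Hessian estimate. This is a minimal sketch, not the paper's exact update rule: the function name `bfgs_inverse_update`, the curvature threshold `eps`, and the test matrix `A` are assumptions made for illustration.

```python
import numpy as np


def bfgs_inverse_update(C, s, y, eps=1e-10):
    """Return the standard BFGS update of an inverse-Hessian estimate C.

    s is the change in the iterate and y the change in the gradient.
    The update is skipped when the curvature s @ y is not safely positive,
    which keeps the estimate symmetric positive definite.
    """
    sy = float(s @ y)
    if sy <= eps * np.linalg.norm(s) * np.linalg.norm(y):
        return C  # curvature condition violated: keep the old estimate
    rho = 1.0 / sy
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ C @ V.T + rho * np.outer(s, s)


# Hypothetical ill-conditioned quadratic with Hessian A (condition number 1e3).
A = np.diag([1.0, 10.0, 1000.0])
rng = np.random.default_rng(0)
C = np.eye(3)
for _ in range(3):
    s = rng.standard_normal(3)
    y = A @ s  # gradient difference of a quadratic objective
    C = bfgs_inverse_update(C, s, y)

# The most recent pair satisfies the secant equation C @ y = s.
secant_residual = np.linalg.norm(C @ y - s)
min_eig = np.linalg.eigvalsh((C + C.T) / 2).min()
```

Skipping the update when `s @ y` is not positive is one common safeguard for preserving positive definiteness; the paper may use a different one.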

Abstract

We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithms in both well-conditioned and ill-conditioned optimization problems, measured by the computation time and communication cost each agent incurs to converge, compared to existing distributed first-order and second-order methods. In particular, in ill-conditioned problems, our algorithms converge in less computation time and at a lower communication cost, across a range of communication networks with different degrees of connectedness.
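The one-hop communication pattern described above can be sketched with plain gradient tracking on a small ring network. This is a first-order toy illustration and not DQN itself: the ring topology, the equal mixing weights in `W`, the shared matrix `A`, the vectors `b`, and the step size `alpha` are all assumptions chosen so the example is easy to check.

```python
import numpy as np

# Ring of 4 agents; each averages with itself and its two one-hop neighbors.
N, n = 4, 2
W = np.zeros((N, N))
for i in range(N):
    for j in (i - 1, i, i + 1):
        W[i, j % N] = 1.0 / 3.0  # doubly stochastic mixing weights

# Hypothetical local objectives f_i(x) = 0.5 x^T A x - b_i^T x (A shared for simplicity).
A = np.diag([1.0, 2.0])
rng = np.random.default_rng(1)
b = rng.standard_normal((N, n))


def local_grads(X):
    """Row i is the gradient of f_i evaluated at agent i's iterate X[i]."""
    return X @ A - b  # A is symmetric, so row i equals A @ X[i] - b[i]


alpha = 0.1
X = np.zeros((N, n))  # row i: agent i's copy of the decision variable
G = local_grads(X)
V = G.copy()  # each tracker starts at the agent's local gradient

for _ in range(400):
    X_new = W @ (X - alpha * V)  # local step, then one-hop averaging
    G_new = local_grads(X_new)
    V = W @ V + G_new - G  # dynamic gradient tracking
    X, G = X_new, G_new

x_star = np.linalg.solve(A, b.mean(axis=0))  # minimizer of the average objective
```

Because `W` is doubly stochastic and each tracker starts at the local gradient, the average of the rows of `V` equals the average of the local gradients at every iteration, which is what lets each agent approximate the global gradient using only neighbor exchanges.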

Paper Structure (22 sections, 6 theorems, 67 equations, 7 figures, 10 tables)

Introduction
Related Work
Notation and Preliminaries
Problem Formulation and Centralized Quasi-Newton Methods
Distributed Unconstrained Optimization
Distributed Constrained Optimization
Numerical Evaluations of DQN
Logistic Regression
Numerical Evaluations of EC-DQN
Basis Pursuit Denoising
Logistic Regression
Conclusion
Numerical Evaluations of DQN
Quadratic Programming
Well-Conditioned Optimization Problems
...and 7 more sections

Key Result

Lemma 1

Provided that ${\alpha_{\max} < \frac{1 - \lambda}{\lambda^{3} L (1 + r_{\alpha}) q}}$, the sequence ${\{\overline{\bm{x}}^{(k)}\}_{\forall k \geq 0}}$ converges to a limit point for sufficiently large $k$, with: where ${q^{2} = \gamma^{2} \min\{n, N\}}$. Further, as ${k \rightarrow \infty}$, the sequence ${\{\overline{\bm{g}}^{(k)}\}_{\forall k \geq 0}}$, denoting the average gradient of the obj
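The step-size condition in Lemma 1 can be evaluated numerically once its constants are known. The sketch below only transcribes the stated bound; the function name, the argument names, the restriction `0 < lam < 1`, the choice of the non-negative root for `q`, and the numerical values are assumptions for illustration, since the excerpt does not define the symbols.

```python
import math


def step_size_upper_bound(lam, L, r_alpha, gamma, n, N):
    """Evaluate (1 - lam) / (lam**3 * L * (1 + r_alpha) * q) with q = gamma * sqrt(min(n, N))."""
    if not 0.0 < lam < 1.0:
        raise ValueError("this sketch assumes 0 < lam < 1")
    q = gamma * math.sqrt(min(n, N))  # from q^2 = gamma^2 * min{n, N}
    return (1.0 - lam) / (lam**3 * L * (1.0 + r_alpha) * q)


# Hypothetical values, chosen only to make the arithmetic easy to follow.
bound = step_size_upper_bound(lam=0.5, L=2.0, r_alpha=1.0, gamma=1.0, n=4, N=9)
print(bound)  # 0.5
```

With these values, q = 2 and the denominator is 0.125 * 2 * 2 * 2 = 1, so any maximum step size below 0.5 would satisfy the condition.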

Figures (7)

Figure 1: Per-iteration convergence error of each agent in the distributed logistic regression problem on a randomly-generated connected communication graph, with ${\kappa = 0.569}$. While the first-order methods DIGing-ATC and $ABm$-DS exhibit slow convergence, DQN and the second-order methods D-Newton Rank-$K$ and ESOM converge faster.
Figure 2: Convergence error of each algorithm in a poorly-conditioned basis pursuit denoising problem, on a randomly-generated communication graph with ${\kappa = 0.566}$. EC-DQN achieves the fastest convergence rate compared to the other algorithms.
Figure 3: Convergence error of each algorithm in constrained logistic regression problems with ${\kappa = 0.566}$. EC-DQN converges the fastest in comparison to the other algorithms.
Figure 4: Convergence error of each agent per iteration in the distributed quadratic programming problem with a well-conditioned Hessian on a randomly-generated connected communication graph, with ${\kappa = 0.569}$.
Figure 5: Convergence error of each agent per iteration in the distributed quadratic programming problem with a poorly-conditioned Hessian on a randomly-generated connected communication graph, with ${\kappa = 0.569}$. The performance of DIGing-ATC and $ABm$-DS degrades notably in poorly-conditioned problems, compared to C-ADMM, DQN, and the second-order methods D-Newton Rank-$K$ and ESOM.
...and 2 more figures

Theorems & Definitions (11)

Lemma 1: Convergence of ${\{\overline{\bm{x}}^{(k)}\}_{\forall k \geq 0}}$
proof
Remark 1
Corollary 1: Convergence of ${\{\overline{\bm{v}}^{(k)}\}_{\forall k \geq 0}}$
Theorem 1: Consensus
proof
Theorem 2: Convergence of the Objective Value
Remark 2
Theorem 3
Lemma 3
...and 1 more
