Distributed Quasi-Newton Method for Multi-Agent Optimization

Ola Shorinwa, Mac Schwager

TL;DR

This work introduces distributed quasi-Newton methods for multi-agent optimization, namely DQN for unconstrained problems and EC-DQN for equality-constrained problems. Each agent builds and uses an estimate of the aggregate Hessian to compute descent directions locally, while achieving consensus through one-hop communications over a fixed network. The methods leverage dynamic gradient tracking to approximate global gradients and employ BFGS/DFP-like updates to maintain positive-definite inverse-Hessian estimates, with convergence to stationary points demonstrated under standard convexity and smoothness assumptions. Empirical results show superior performance in ill-conditioned settings, offering faster convergence and reduced communication costs compared to existing distributed first- and second-order methods, and broad applicability to problems like logistic regression and basis pursuit denoising.

Abstract

We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithms in both well-conditioned and ill-conditioned optimization problems, in terms of the computation time and communication cost incurred by each agent for convergence, compared to existing distributed first-order and second-order methods. In particular, in ill-conditioned problems, our algorithms converge in less computation time while incurring a lower communication cost, across a range of communication networks with different degrees of connectedness.
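To make the ingredients named above concrete, here is a minimal Python sketch that combines one-hop averaging, gradient tracking, and a BFGS-style inverse-Hessian update on a small quadratic problem. It is an illustration under stated assumptions, not the authors' DQN algorithm: the function name `run_dqn_sketch`, the quadratic local objectives, the four-agent ring with uniform mixing weights, the step size, the initial scaling `0.4 * I`, and the curvature safeguard are all hypothetical choices made for this example.

```python
import numpy as np

def run_dqn_sketch(num_iters=500, alpha=0.3, seed=0):
    """Toy gradient-tracking iteration with per-agent BFGS inverse-Hessian estimates."""
    rng = np.random.default_rng(seed)
    N, n = 4, 3  # number of agents, problem dimension (illustrative values)

    # Assumed local objectives: f_i(x) = 0.5 x^T A_i x - b_i^T x, with A_i positive definite.
    base = np.diag(np.linspace(0.5, 2.0, n))
    A, b = [], []
    for _ in range(N):
        R = rng.standard_normal((n, n))
        P = R @ R.T
        A.append(base + 0.2 * P / np.linalg.norm(P, 2))
        b.append(rng.standard_normal(n))
    x_star = np.linalg.solve(sum(A), sum(b))  # minimizer of the aggregate objective

    # Doubly stochastic mixing matrix for a ring: each agent averages with two neighbors.
    W = np.zeros((N, N))
    for i in range(N):
        W[i, i] = W[i, (i - 1) % N] = W[i, (i + 1) % N] = 1.0 / 3.0

    def grad(i, xi):
        return A[i] @ xi - b[i]

    x = rng.standard_normal((N, n))                      # one row per agent
    g = np.array([grad(i, x[i]) for i in range(N)])      # local gradients
    v = g.copy()                                         # trackers of the average gradient
    C = [0.4 * np.eye(n) for _ in range(N)]              # inverse-Hessian estimates

    for _ in range(num_iters):
        # Quasi-Newton direction from the tracked gradient, then one-hop averaging.
        d = np.array([-C[i] @ v[i] for i in range(N)])
        x_new = W @ (x + alpha * d)
        g_new = np.array([grad(i, x_new[i]) for i in range(N)])
        v = W @ (v + g_new - g)                          # gradient-tracking update

        for i in range(N):
            s, y = x_new[i] - x[i], g_new[i] - g[i]
            ys = y @ s
            if ys > 1e-12:  # safeguard: update only when the curvature condition holds
                rho = 1.0 / ys
                V = np.eye(n) - rho * np.outer(s, y)
                C[i] = V @ C[i] @ V.T + rho * np.outer(s, s)  # BFGS inverse update
        x, g = x_new, g_new

    return x, x_star

if __name__ == "__main__":
    x, x_star = run_dqn_sketch()
    print("max deviation from the aggregate minimizer:", np.max(np.abs(x - x_star)))
```

Each agent touches only its own gradient and the values its two ring neighbors would send, which mirrors the one-hop communication pattern described in the abstract. The curvature pair here uses local gradient differences, which is one plausible reading of "using the gradient of its local objective function"; the paper's exact update rules may differ.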

Paper Structure

This paper contains 22 sections, 6 theorems, 67 equations, 7 figures, and 10 tables.

Key Result

Lemma 1

Provided that ${\alpha_{\max} < \frac{1 - \lambda}{\lambda^{3} L (1 + r_{\alpha}) q}}$, the sequence ${\{\overline{\bm{x}}^{(k)}\}_{\forall k \geq 0}}$ converges to a limit point for sufficiently large $k$, with: where ${q^{2} = \gamma^{2} \min\{n, N\}}$. Further, as ${k \rightarrow \infty}$, the sequence ${\{\overline{\bm{g}}^{(k)}\}_{\forall k \geq 0}}$, denoting the average gradient of the obj

Figures (7)

  • Figure 1: Per-iteration convergence error of each agent in the distributed logistic regression problem on a randomly-generated connected communication graph, with ${\kappa = 0.569}$. While the first-order methods DIGing-ATC and $ABm$-DS exhibit slow convergence, DQN and the second-order methods D-Newton Rank-$K$ and ESOM converge faster.
  • Figure 2: Convergence error of each algorithm in a poorly-conditioned basis pursuit denoising problem, on a randomly-generated communication graph with ${\kappa = 0.566}$. EC-DQN achieves the fastest convergence rate compared to the other algorithms.
  • Figure 3: Convergence error of each algorithm in constrained logistic regression problems with ${\kappa = 0.566}$. EC-DQN converges the fastest in comparison to the other algorithms.
  • Figure 4: Convergence error of each agent per iteration in the distributed quadratic programming problem with a well-conditioned Hessian on a randomly-generated connected communication graph, with ${\kappa = 0.569}$.
  • Figure 5: Convergence error of each agent per iteration in the distributed quadratic programming problem with a poorly-conditioned Hessian on a randomly-generated connected communication graph, with ${\kappa = 0.569}$. The performance of DIGing-ATC and $ABm$-DS degrades notably in poorly-conditioned problems, compared to C-ADMM, DQN, and the second-order methods D-Newton Rank-$K$ and ESOM.
  • ...and 2 more figures

Theorems & Definitions (11)

  • Lemma 1: Convergence of ${\{\overline{\bm{x}}^{(k)}\}_{\forall k \geq 0}}$
  • proof
  • Remark 1
  • Corollary 1: Convergence of ${\{\overline{\bm{v}}^{(k)}\}_{\forall k \geq 0}}$
  • Theorem 1: Consensus
  • proof
  • Theorem 2: Convergence of the Objective Value
  • Remark 2
  • Theorem 3
  • Lemma 3
  • ...and 1 more