Distributed Quasi-Newton Method for Multi-Agent Optimization
Ola Shorinwa, Mac Schwager
TL;DR
This work introduces distributed quasi-Newton methods for multi-agent optimization: DQN for unconstrained problems and EC-DQN for equality-constrained problems. Each agent builds and uses an estimate of the aggregate Hessian to compute descent directions locally, while reaching consensus through one-hop communication over a fixed network. The methods use dynamic gradient tracking to approximate global gradients and employ BFGS/DFP-like updates to maintain positive-definite inverse-Hessian estimates, with convergence to stationary points proven under standard convexity and smoothness assumptions. Empirical results show superior performance in ill-conditioned settings, with faster convergence and lower communication cost than existing distributed first- and second-order methods, and the methods apply to problems such as logistic regression and basis pursuit denoising.
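To make the gradient-tracking idea concrete, the following is a minimal sketch of a generic gradient-tracking iteration on scalar quadratic objectives. It is not the DQN update itself: the descent direction here is the tracked gradient with no curvature scaling, and the objectives f_i(x) = 0.5 * a_i * (x - b_i)^2, the ring network, the mixing weights, the step size, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents (assumed topology)."""
    W = np.zeros((n, n))
    for i in range(n):
        # Each agent averages itself and its two one-hop neighbors equally.
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W


def gradient_tracking(a, b, W, alpha=0.05, iters=500):
    """Generic gradient tracking for f_i(x) = 0.5 * a_i * (x - b_i)^2.

    x[i] is agent i's iterate, y[i] is its running estimate of the
    network-average gradient. Returns the iterates, the trackers, and
    the local gradients at the final iterates.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    def local_grad(x):
        return a * (x - b)

    x = np.zeros_like(a)
    g = local_grad(x)
    y = g.copy()  # initializing y with the local gradients preserves the average
    for _ in range(iters):
        x_new = W @ x - alpha * y   # mix with neighbors, step along tracked gradient
        g_new = local_grad(x_new)
        y = W @ y + g_new - g       # mix trackers, add the change in local gradient
        x, g = x_new, g_new
    return x, y, g


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [1.0, -1.0, 2.0, 0.0]
    x, y, g = gradient_tracking(a, b, ring_mixing_matrix(4))
    print("agent iterates:", x)
    print("minimizer of the aggregate objective:", np.dot(a, b) / np.sum(a))
```

Because the mixing matrix is doubly stochastic, the mean of the trackers equals the mean of the local gradients at every iteration, which is what lets each agent act on an estimate of the global gradient using only one-hop exchanges.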
Abstract
We present a distributed quasi-Newton (DQN) method, which enables a group of agents to compute an optimal solution of a separable multi-agent optimization problem locally using an approximation of the curvature of the aggregate objective function. Each agent computes a descent direction from its local estimate of the aggregate Hessian, obtained from quasi-Newton approximation schemes using the gradient of its local objective function. Moreover, we introduce a distributed quasi-Newton method for equality-constrained optimization (EC-DQN), where each agent takes Karush-Kuhn-Tucker-like update steps to compute an optimal solution. In our algorithms, each agent communicates with its one-hop neighbors over a peer-to-peer communication network to compute a common solution. We prove convergence of our algorithms to a stationary point of the optimization problem. In addition, we demonstrate the competitive empirical convergence of our algorithms in both well-conditioned and ill-conditioned optimization problems, in terms of the computation time and communication cost incurred by each agent for convergence, compared to existing distributed first-order and second-order methods. In particular, in ill-conditioned problems, our algorithms converge in less computation time and at a lower communication cost, across a range of communication networks with different degrees of connectedness.
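As an illustration of the kind of quasi-Newton approximation scheme referred to above, the following is a minimal sketch of the textbook BFGS update of an inverse-Hessian estimate from a step s and a gradient difference y. The exact update used in DQN and EC-DQN is not given in this abstract and may differ (for example, in how curvature is safeguarded); the function name, the skip rule, and the tolerance are assumptions for illustration.

```python
import numpy as np


def bfgs_inverse_update(H, s, y, eps=1e-10):
    """Textbook BFGS update of an inverse-Hessian estimate H.

    s is the change in the iterate and y the change in the gradient.
    The update is skipped (H returned unchanged) when the curvature
    condition s^T y > 0 fails, which keeps H positive definite.
    """
    sy = float(s @ y)
    if sy <= eps * np.linalg.norm(s) * np.linalg.norm(y):
        return H  # assumed safeguard: skip rather than damp
    rho = 1.0 / sy
    V = np.eye(len(s)) - rho * np.outer(s, y)
    # H_new = (I - rho s y^T) H (I - rho y s^T) + rho s s^T
    return V @ H @ V.T + rho * np.outer(s, s)


if __name__ == "__main__":
    A = np.diag([1.0, 100.0])      # ill-conditioned quadratic, Hessian A
    s = np.array([1.0, 0.5])
    y = A @ s                      # gradient difference for a quadratic
    H_new = bfgs_inverse_update(np.eye(2), s, y)
    print("secant residual:", np.linalg.norm(H_new @ y - s))
```

The updated estimate satisfies the secant equation H_new y = s, so it captures the curvature along the most recent step using only gradient information, which is what makes such schemes attractive when exact Hessians are too costly to compute or communicate.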
