Distributed Inexact Newton Method with Adaptive Step Sizes

Dusan Jakovetic; Natasa Krejic; Greta Malaspina

TL;DR

This work addresses distributed optimization over networks in two common settings: distributed personalized optimization, posed through a penalty formulation $\Phi_\beta$, and distributed consensus optimization, posed as a min-sum objective. It introduces DINAS, a Distributed Inexact Newton method with Adaptive Step sizes. DINAS computes Newton directions inexactly with a distributed Jacobi Overrelaxation method and uses a Polyak-inspired adaptive step size to guarantee global convergence while retaining fast local convergence. For the personalized setting, the authors prove global convergence, establish linear, superlinear, and quadratic rates, and derive linear convergence with respect to communication rounds. For the consensus setting, they show that solving a sequence of penalty problems with decreasing β yields convergence to the consensus solution. They also analyze a centralized inexact Newton method with adaptive steps and report extensive numerical results showing significant savings in both computation and communication costs over state-of-the-art methods, including on ill-conditioned and large-scale problems. Overall, the method requires knowledge of fewer global constants, avoids Hessian inverses at each node, and achieves substantial practical gains in distributed second-order optimization.
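
To make the two settings concrete, the block below writes out one common way such problems are posed. It is a sketch under assumptions, not the paper's exact definition: the local costs $f_i$, the local variables $x_i$, and the nonnegative network weights $w_{ij}$ are notation introduced here for illustration, and the paper's weighting of the penalty term may differ.

$$
\text{(consensus)} \quad \min_{y} \; \sum_{i=1}^{N} f_i(y),
\qquad
\text{(personalized)} \quad \min_{x_1,\dots,x_N} \; \Phi_\beta(x) = \sum_{i=1}^{N} f_i(x_i) + \frac{1}{2\beta} \sum_{i<j} w_{ij} \, \| x_i - x_j \|^2 .
$$

Under this assumed form, a smaller β puts more weight on disagreement between neighboring agents, which matches the idea of reaching the consensus solution by solving penalty problems with decreasing β.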

Abstract

We consider two formulations for distributed optimization wherein $N$ agents in a generic connected network solve a problem of common interest: distributed personalized optimization and consensus optimization. A new method termed DINAS (Distributed Inexact Newton method with Adaptive Stepsize) is proposed. DINAS employs large, adaptively computed step sizes, requires less knowledge of global parameters than existing alternatives, and can operate without any local Hessian inverse calculations or Hessian communications. When solving personalized distributed learning formulations, DINAS achieves quadratic convergence with respect to computational cost and linear convergence with respect to communication cost, the latter rate being independent of the local functions' condition numbers and of the network topology. When solving consensus optimization problems, DINAS is shown to converge to the global solution. Extensive numerical experiments demonstrate significant improvements of DINAS over existing alternatives. As a result of independent interest, we provide, for the first time, a convergence analysis of the Newton method with Polyak's adaptive step size when the Newton direction is computed inexactly in a centralized environment.
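
The ingredients named in the abstract (an inexact Newton direction, a Jacobi Overrelaxation inner solver, and an adaptive step size) can be illustrated with a small centralized sketch. This is not the DINAS algorithm and not the paper's step-size formula: the rule `min(1, step_const / ||gradient||)` is a Polyak-style stand-in, and `step_const`, `eta` (the forcing term), `omega` (the relaxation parameter), and the test function are all assumptions chosen for illustration.

```python
import numpy as np

def jor_solve(A, b, omega=0.8, tol_rel=0.1, max_iter=500):
    """Jacobi Overrelaxation for A d = b, stopped once
    ||A d - b|| <= tol_rel * ||b|| (an inexact solve)."""
    diag = np.diag(A)
    d = np.zeros_like(b)
    for _ in range(max_iter):
        r = b - A @ d
        if np.linalg.norm(r) <= tol_rel * np.linalg.norm(b):
            break
        d = d + omega * r / diag  # only the diagonal is inverted
    return d

def inexact_newton(grad, hess, x0, step_const=5.0, eta=0.1,
                   tol=1e-8, max_iter=200):
    """Inexact Newton iteration with an assumed Polyak-style step cap."""
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        g_norm = np.linalg.norm(g)
        if g_norm <= tol:
            break
        d = jor_solve(hess(x), -g, tol_rel=eta)  # approximate Newton direction
        alpha = min(1.0, step_const / g_norm)    # short steps far away, full steps nearby
        x = x + alpha * d
    return x

# Hypothetical strongly convex test problem:
# f(x) = 0.5 x^T A x - b^T x + sum_i log(1 + exp(x_i))
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grad(x):
    return A @ x - b + sigmoid(x)

def hess(x):
    s = sigmoid(x)
    return A + np.diag(s * (1.0 - s))

x_star = inexact_newton(grad, hess, x0=5.0 * np.ones(3))
```

The inner solver never forms a Hessian inverse; it divides only by diagonal entries, which is the property that makes Jacobi-type iterations attractive when each node holds one block of the system. The test matrix is strictly diagonally dominant, so the relaxed Jacobi iteration converges for the chosen `omega`.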

Paper Structure (10 sections, 9 theorems, 77 equations, 4 figures)

Introduction
Model and preliminaries
Algorithm DINAS: Personalized distributed optimization
Convergence analysis of DINAS for personalized distributed optimization
Analysis of inexact centralized Newton method with Polyak's adaptive step size
Convergence analysis for DINAS: Consensus optimization
Numerical Results
Numerical results for distributed personalized optimization
Comparison with Exact Methods for consensus optimization
Conclusions

Key Result

Lemma 4.1

Let Assumptions ass:network - ass:objetive hold. Then the following statements hold:

Figures (4)

Figure 1: Choice of the forcing terms, Logistic Regression
Figure 2: Total cost, Logistic Regression
Figure 3: Total cost, Logistic Regression, LSVT dataset
Figure 4: Total cost, quadratic problem

Theorems & Definitions (24)

Lemma 4.1
proof
Theorem 4.1
proof
Remark 4.1
Remark 4.2
Theorem 4.2
proof
Theorem 4.3
proof
...and 14 more
