Distributed Derivative-Free Optimization Using Inexact ADMM and Trust-Region Methods

Damilola Fasiku, Wentao Tang

TL;DR

This paper advances distributed derivative-free optimization by combining a two-level inexact ADMM framework with trust-region DFO subsolvers to tackle high-dimensional, nonconvex problems with general linear couplings. By decoupling variables into blocks and using an inner ADMM loop alongside an outer method-of-multipliers loop, the approach handles complex constraint structures beyond consensus while solving subproblems inexactly to maintain efficiency. The authors establish global convergence to approximate stationary points in both smooth and nonsmooth settings, leveraging the trust-region radius as a practical measure of stationarity and providing concrete parameter-tuning guidance. Numerical experiments on convex and nonconvex problems, including distributed neural network training with consensus, demonstrate favorable scalability and competitive performance relative to monolithic solvers, underscoring the method’s potential for large-scale, black-box optimization in distributed environments.
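The problem class and the slack-variable relaxation described above can be written compactly as follows. This is only a sketch: the symbols $f_i$, $A_i$, $b$, $z$, $\lambda$, and $\beta$ are notation assumed here for illustration and may differ from the paper's own.

```latex
\begin{aligned}
\text{original problem:}\quad & \min_{x_1,\dots,x_N}\ \sum_{i=1}^{N} f_i(x_i)
  \quad \text{s.t.} \quad \sum_{i=1}^{N} A_i x_i = b, \\
\text{relaxation with slack } z:\quad & \sum_{i=1}^{N} A_i x_i + z = b, \qquad z = 0, \\
\text{outer-level objective:}\quad & \sum_{i=1}^{N} f_i(x_i) + \lambda^{\top} z + \tfrac{\beta}{2}\,\lVert z \rVert^{2}.
\end{aligned}
```

In this reading, the inner ADMM loop works on the relaxed coupling constraint, while the outer method-of-multipliers loop updates $\lambda$ (and possibly $\beta$) to push $z$ toward zero.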

Abstract

To reduce complexity and achieve scalable performance in high-dimensional black-box settings, we propose a distributed method for nonconvex derivative-free optimization of continuous variables with an additively separable objective, subject to linear equality constraints. The approach is built upon the alternating direction method of multipliers (ADMM) as the distributed optimization framework. To handle general, potentially complicating linear equality constraints beyond the standard ADMM formulation, we employ a two-level ADMM structure: an inner layer that performs sequential ADMM updates, and an outer layer that drives an introduced slack variable to zero via the method of multipliers. In addition, each subproblem is solved inexactly using a derivative-free trust-region solver, ensuring suboptimality within a decreasing, theoretically controlled error tolerance. This inexactness is critical for both computational efficiency and practical applicability in black-box settings, where exact solutions are impractical or overly expensive. We establish theoretical convergence of the proposed approach to an approximate solution, and demonstrate improved computational efficiency over monolithic derivative-free optimization approaches on challenging high-dimensional benchmarks, as well as effective performance on a distributed learning problem.
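To make the two-level structure concrete, here is a minimal sketch on a two-block scalar toy problem, min f1(x1) + f2(x2) subject to x1 - x2 = 0. It is not the authors' algorithm: the function names, the 1-D interpolation-based trust-region subsolver, the choice rho = 2 * beta, the penalty cap `beta_max`, and the halving tolerance schedule are all assumptions made for illustration.

```python
def dfo_trust_region_1d(phi, x, tol, delta=1.0, max_iter=200):
    """Simplified 1-D derivative-free trust-region minimizer (illustrative only)."""
    for _ in range(max_iter):
        if delta < tol:  # small radius is used as the stopping (stationarity) signal
            break
        f0, fp, fm = phi(x), phi(x + delta), phi(x - delta)
        g = (fp - fm) / (2.0 * delta)           # model gradient from interpolation
        h = (fp - 2.0 * f0 + fm) / delta ** 2   # model curvature from interpolation
        if abs(g) <= delta:  # criticality-style test: gradient small relative to radius
            delta *= 0.5
            continue
        # Minimize the model m(s) = f0 + g*s + 0.5*h*s^2 over |s| <= delta.
        if h > 0.0:
            s = max(-delta, min(delta, -g / h))
        else:
            s = -delta if g > 0.0 else delta
        predicted = -(g * s + 0.5 * h * s * s)
        actual = f0 - phi(x + s)
        if predicted > 0.0 and actual >= 0.1 * predicted:
            x += s                    # successful step
            if abs(s) >= delta:       # step reached the boundary: enlarge the region
                delta *= 2.0
        else:
            delta *= 0.5              # unsuccessful step: shrink the region
    return x


def two_level_admm(f1, f2, outer_iters=12, inner_iters=60, beta_max=8.0):
    """Two-level ADMM sketch for min f1(x1) + f2(x2) s.t. x1 - x2 + z = 0, z = 0."""
    x1 = x2 = z = 0.0
    lam, y, beta = 0.0, 0.0, 1.0      # outer multiplier, inner multiplier, outer penalty
    for _ in range(outer_iters):
        rho = 2.0 * beta              # inner penalty (assumed ratio)
        for t in range(inner_iters):
            tol = max(1e-6, 0.5 ** (t + 1))  # decreasing subproblem tolerance (assumed)
            # Block updates: each subproblem is solved inexactly, without derivatives.
            x1 = dfo_trust_region_1d(
                lambda u: f1(u) + y * (u - x2 + z) + 0.5 * rho * (u - x2 + z) ** 2,
                x1, tol)
            x2 = dfo_trust_region_1d(
                lambda v: f2(v) + y * (x1 - v + z) + 0.5 * rho * (x1 - v + z) ** 2,
                x2, tol)
            r = x1 - x2
            # Slack update in closed form: minimizes
            # lam*z + beta/2*z^2 + y*(r + z) + rho/2*(r + z)^2 over z.
            z = -(lam + y + rho * r) / (beta + rho)
            y += rho * (r + z)        # inner dual update
        lam += beta * z               # outer method-of-multipliers update
        beta = min(2.0 * beta, beta_max)
    return x1, x2, z


x1, x2, z = two_level_admm(lambda u: (u - 1.0) ** 2, lambda v: (v + 3.0) ** 2)
print(x1, x2, z)
```

For these two quadratics the consensus minimizer is x1 = x2 = -1, so the printed values should be close to -1, -1, and 0. The penalty is capped in this sketch because a large inner penalty slows the inner ADMM loop on this toy problem; the paper's own parameter-tuning guidance is not reproduced here.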

Paper Structure

This paper contains 19 sections, 8 theorems, 76 equations, 4 figures, 2 tables, 3 algorithms.

Key Result

Lemma 1

Under Assumptions ass:smooth-lip-gradient and ass:hessian-bounded, the trust-region radius satisfies […] and all limit points of the sequence $\{x^{q}\}$ are first-order stationary; hence […]

Figures (4)

  • Figure 1: An illustration of the two-layer ADMM-DFO algorithm.
  • Figure 2: Convergence of the augmented Lagrangian and residual norms for the 1200-dimensional ARWHEAD problem. Colors indicate progression of outer iterations from blue (initial) to red (final).
  • Figure 3: Convergence of the augmented Lagrangian and residual norms for the 500-dimensional Rosenbrock problem. Colors indicate progression of outer iterations from blue (initial) to red (final).
  • Figure 4: Convergence of the augmented Lagrangian and residual norms for the nonconvex nonsmooth problem. Colors indicate progression of outer iterations from blue (initial) to red (final).

Theorems & Definitions (13)

  • Lemma 1: Global convergence of Algorithm \ref{dfo_smooth_alg} (conn2009)
  • Lemma 2
  • Lemma 3: Smooth case: gradient norm–trust-region radius bound
  • proof
  • Remark 1
  • Lemma 4: Convergence of inner iterations for the smooth case
  • proof
  • Remark 2
  • Lemma 5: Convergence of outer iterations for the smooth case
  • proof
  • ...and 3 more