A consensus-based optimization method for nonsmooth nonconvex programs with approximated gradient descent scheme

Jiazhen Wei; Fan Wu; Wei Bian

TL;DR

This work introduces Extra-Step Consensus-Based Optimization (ESCBO), a derivative-free algorithm that combines discrete consensus-based optimization (CBO) with an approximated gradient descent step computed by forward differences, so that nonsmooth nonconvex problems can be tackled using only function values. The authors prove that global consensus emerges at an exponential rate both in L^2 and almost surely, establish an error bound between the objective value at the consensus point and the global minimum that vanishes as β→∞ with rate O(log β/β) under smoothness assumptions, and derive an expected iteration complexity of O(log(1/ε)) to reach accuracy ε. The analysis avoids mean-field limits, relying instead on probabilistic tools, martingale arguments, and Laplace-type estimates, and is complemented by numerical experiments on classical benchmarks and DNN training. In these experiments, ESCBO improves over vanilla CBO in finding global minimizers, and a fast mini-batch variant (FESCBO) extends it to large-scale learning tasks. The results point to a gradient-free approach to nonconvex optimization that comes with provable guarantees.
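The role of β can be illustrated with a small numerical sketch. It assumes the usual CBO construction, in which the consensus point is a Gibbs-weighted average of the particles with weights proportional to exp(-β f(x)); the function names, the 1D Rastrigin test function, and the particle grid are illustrative choices, not taken from the paper.

```python
import numpy as np

def consensus_point(particles, f, beta):
    """Gibbs-weighted average of the particles (weights ~ exp(-beta * f)).

    Hypothetical helper for illustration; `particles` has shape (N, d).
    """
    values = np.array([f(x) for x in particles])
    # Shift by the minimum value so the exponentials cannot overflow.
    weights = np.exp(-beta * (values - values.min()))
    return weights @ particles / weights.sum()

def rastrigin(x):
    # Classical nonconvex benchmark; global minimum 0 at the origin.
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

# Deterministic, asymmetric 1D grid of particles that contains the minimizer.
particles = np.linspace(-2.0, 4.0, 61).reshape(-1, 1)

for beta in (0.01, 1.0, 10.0, 100.0):
    c = consensus_point(particles, rastrigin, beta)
    print(f"beta={beta:7.2f}  consensus={c[0]: .6f}  f(consensus)={rastrigin(c):.6f}")
```

For small β the weights are nearly uniform and the consensus point sits close to the plain mean of the particles; as β grows, the weight concentrates on the particle with the lowest objective value, which is the Laplace-type mechanism behind the vanishing error bound.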

Abstract

In this paper, we are interested in finding the global minimizer of a nonsmooth nonconvex unconstrained optimization problem. By combining the discrete consensus-based optimization (CBO) algorithm with the gradient descent method, we develop a novel CBO algorithm with an extra gradient descent step in which the gradient is approximated by forward differences, so that the proposed algorithm uses only objective function values. First, we prove that the proposed algorithm exhibits global consensus at an exponential rate in two senses and possesses a unique global consensus point. Second, we estimate the error between the objective function value at the global consensus point and the global minimum. In particular, as the parameter $\beta$ tends to $\infty$, the error converges to zero and the convergence rate is $\mathcal{O}\left(\frac{\log\beta}{\beta}\right)$. Third, under some suitable assumptions on the objective function, we provide the number of iterations required for the mean square error in expectation to reach the desired accuracy. It is worth underlining that the theoretical analysis in this paper does not use the mean-field limit. Finally, we illustrate the improved efficiency and promising performance of our novel CBO method through experiments on several nonconvex benchmark problems and an application to training deep neural networks.
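The idea of pairing a consensus step with a function-value-only descent step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' ESCBO algorithm: it uses a generic discrete CBO update (drift toward the Gibbs-weighted consensus point plus componentwise multiplicative noise) followed by a descent step along a forward-difference gradient estimate. All function names, parameter values, and the test objective are hypothetical.

```python
import numpy as np

def forward_difference_gradient(f, x, h=1e-4):
    """Estimate the gradient of f at x from function values only."""
    fx = f(x)
    grad = np.empty_like(x)
    for j in range(x.size):
        shifted = x.copy()
        shifted[j] += h
        grad[j] = (f(shifted) - fx) / h
    return grad

def cbo_extra_step(f, particles, beta=30.0, drift=0.5, noise=0.3,
                   step=0.01, h=1e-4, iters=200, seed=0):
    """Generic CBO iteration plus an extra approximated descent step (sketch)."""
    rng = np.random.default_rng(seed)
    x = np.array(particles, dtype=float)
    for _ in range(iters):
        values = np.array([f(p) for p in x])
        # Gibbs weights, shifted by the minimum for numerical stability.
        weights = np.exp(-beta * (values - values.min()))
        consensus = weights @ x / weights.sum()
        diff = x - consensus
        # Consensus step: drift toward the consensus point plus scaled noise.
        x = x - drift * diff + noise * diff * rng.standard_normal(x.shape)
        # Extra step: descent along the forward-difference gradient estimate.
        grads = np.array([forward_difference_gradient(f, p, h) for p in x])
        x = x - step * grads
    return x

def nonsmooth_objective(x):
    # Illustrative nonsmooth nonconvex function with global minimum at 0.
    return np.sum(np.abs(x)) + 0.5 * np.sum(1.0 - np.cos(2.0 * np.pi * x))

rng = np.random.default_rng(1)
initial = rng.uniform(-3.0, 3.0, size=(50, 2))
final = cbo_extra_step(nonsmooth_objective, initial)
print("initial spread:", initial.std(axis=0).max())
print("final spread:  ", final.std(axis=0).max())
print("best final value:", min(nonsmooth_objective(p) for p in final))
```

The printed spread gives a rough view of consensus forming among the particles. Whether the common limit is near the global minimizer depends on the parameters and the initial particles, which is the question the paper's error estimates address.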

Paper Structure (19 sections, 18 theorems, 112 equations, 3 figures, 4 tables, 2 algorithms)

Introduction
Preliminaries
Notations
Probabilistic tools
Gradient estimation
A CBO algorithm with an extra approximated gradient descent step
Global consensus analysis
Emergence of global consensus
Emergence of the consensus point
Almost sure convergence
Convergence analysis of the ESCBO algorithm
Convergence rate under stronger condition
Iterate convergence in expectation
Numerical experiments
Validation of global consensus
...and 4 more sections

Key Result

Theorem 2.1

If $\{(X_k,\mathcal{F}_k)\}_{k\geq 0}$ is a submartingale and satisfies $\sup_k\mathbb{E}[|X_k|]<\infty$, then there exists $X_\infty$ such that $\mathbb{E}\left[|X_\infty|\right]<\infty$ and $X_k\rightarrow X_\infty$ a.s.

Figures (3)

Figure 1: A brief description of the main results of this paper.
Figure 2: Positions of particles at different iterations $k$ in Section 7.1.
Figure 3: 2D plots of benchmark functions in Sections 7.1 and 7.2 with $d=2$.

Theorems & Definitions (37)

Definition 2.1
Definition 2.2
Definition 2.3
Theorem 2.1: Doob's martingale convergence theorem [Resnick1999]
Proposition 2.2: Borel-Cantelli lemma [Resnick1999]
Proposition 2.3
Proposition 2.4: Laplace's principle [Pinnau2017]
Lemma 2.5
proof
Remark 2.1
...and 27 more
