A consensus-based optimization method for nonsmooth nonconvex programs with approximated gradient descent scheme
Jiazhen Wei, Fan Wu, Wei Bian
TL;DR
This work introduces Extra-Step Consensus-Based Optimization (ESCBO), a derivative-free algorithm that combines discrete consensus-based optimization with an extra gradient descent step whose gradient is approximated by forward differences, so that only function values are used on nonsmooth nonconvex problems. The authors prove that global consensus occurs at an exponential rate both in L^2 and almost surely, establish an error bound between the objective value at the consensus point and the global minimum that vanishes as β→∞ at rate O(log β/β) under smoothness assumptions, and derive an iteration complexity of O(log(1/ε)) in expectation to reach accuracy ε. The analysis avoids mean-field limits, relying instead on probabilistic tools, martingale arguments, and Laplace-type estimates, and is complemented by numerical experiments on classical benchmarks and DNN training. In these experiments, ESCBO finds global minimizers more reliably than vanilla CBO, and a fast mini-batch variant (FESCBO) extends it to large-scale learning tasks. Together, the results offer a gradient-free approach to nonsmooth nonconvex optimization that is backed by convergence guarantees and empirical evidence.
Abstract
In this paper, we seek the global minimizer of a nonsmooth nonconvex unconstrained optimization problem. By combining the discrete consensus-based optimization (CBO) algorithm with the gradient descent method, we develop a novel CBO algorithm with an extra gradient descent step whose gradient is approximated by forward differences of function values, so that the proposed algorithm uses only objective function values. First, we prove that the proposed algorithm exhibits global consensus at an exponential rate in two senses and possesses a unique global consensus point. Second, we estimate the error between the objective function value at the global consensus point and the global minimum. In particular, as the parameter $\beta$ tends to $\infty$, the error converges to zero at the rate $\mathcal{O}\left(\frac{\log\beta}{\beta}\right)$. Third, under suitable assumptions on the objective function, we provide the number of iterations required for the mean square error in expectation to reach a desired accuracy. We emphasize that the theoretical analysis in this paper does not use the mean-field limit. Finally, we illustrate the improved efficiency and promising performance of our novel CBO method through experiments on several nonconvex benchmark problems and an application to training deep neural networks.
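To make the idea of pairing a discrete CBO update with a forward-difference gradient step concrete, the following is a minimal Python sketch and not the authors' exact scheme. The abstract does not give the update rule, so everything here is an assumption made for illustration: the Gibbs-weighted consensus point, the noise shared across particles, the order of the two steps, the function names (`consensus_point`, `forward_difference_grad`, `cbo_with_fd_step`, `objective`) and all parameter values (`beta`, `lam`, `sigma`, `step`, `h`).

```python
import numpy as np

def objective(x):
    # Hypothetical test function (not from the paper): nonsmooth because of |x|,
    # nonconvex because of the cosine term; its global minimum is 0 at x = 0.
    return np.sum(np.abs(x)) + 0.5 * np.sum(1.0 - np.cos(2.0 * np.pi * x))

def consensus_point(X, fvals, beta):
    # Weighted average with weights exp(-beta * f); subtracting the minimum
    # value leaves the average unchanged and avoids numerical underflow.
    w = np.exp(-beta * (fvals - fvals.min()))
    return (w[:, None] * X).sum(axis=0) / w.sum()

def forward_difference_grad(f, x, h=1e-6):
    # Approximate each partial derivative by (f(x + h e_j) - f(x)) / h,
    # using function values only.
    fx = f(x)
    g = np.empty_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (f(x + e) - fx) / h
    return g

def cbo_with_fd_step(f, X0, beta=30.0, lam=0.5, sigma=0.3,
                     step=0.01, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    X = np.array(X0, dtype=float)
    for _ in range(iters):
        fvals = np.array([f(x) for x in X])
        xbar = consensus_point(X, fvals, beta)
        # Assumed CBO step: drift toward the consensus point plus
        # componentwise noise shared by all particles.
        eta = rng.standard_normal(X.shape[1])
        X = X - lam * (X - xbar) + sigma * (X - xbar) * eta
        # Assumed extra step: gradient descent with the approximated gradient.
        G = np.array([forward_difference_grad(f, x) for x in X])
        X = X - step * G
    fvals = np.array([f(x) for x in X])
    return consensus_point(X, fvals, beta), X

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X0 = rng.uniform(-3.0, 3.0, size=(20, 2))  # 20 particles in 2 dimensions
    x_star, X_final = cbo_with_fd_step(objective, X0)
    print("consensus point:", x_star, "objective value:", objective(x_star))
```

In this sketch a larger `beta` concentrates the weights on the particles with the lowest objective values, which matches the role of $\beta$ in the error estimate, while `step` and `h` control the extra descent step and the accuracy of the finite-difference approximation.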
