CB$^2$O: Consensus-Based Bi-Level Optimization
Nicolás García Trillos, Sixu Li, Konstantin Riedl, Yuhua Zhu
TL;DR
This work introduces CB$^2$O, a derivative-free, multi-particle method for nonconvex bi-level optimization, in which the upper-level objective $G$ is minimized over the set $\Theta$ of global minimizers of the lower-level objective $L$. CB$^2$O constructs a consensus point $m^{G,L}_{\alpha,\beta}(\rho)$ by selecting particles via a $\beta$-quantile of $L$ and applying a Laplace-type weighting with respect to $G$, which guides particles toward the global bi-level minimizer $\theta_{good}^*$. The authors prove existence and regularity of the mean-field CB$^2$O dynamics and establish global convergence in mean-field law to $\theta_{good}^*$, using a novel quantitative quantiled Laplace principle (Q2LP) and a stability estimate for the consensus point under combined Wasserstein and $L^2$ perturbations. Extensive numerical experiments on constrained global optimization, sparse representation learning, and clustered federated learning illustrate the practicality, efficiency, and robustness of CB$^2$O as a principled metaheuristic for challenging bi-level problems.
Abstract
Bi-level optimization problems, where one wishes to find the global minimizer of an upper-level objective function over the globally optimal solution set of a lower-level objective, arise in a variety of scenarios throughout science and engineering, machine learning, and artificial intelligence. In this paper, we propose and investigate, analytically and experimentally, consensus-based bi-level optimization (CB$^2$O), a multi-particle, derivative-free metaheuristic designed to solve bi-level optimization problems when both objectives may be nonconvex. In computing the consensus point, our method combines a carefully designed particle selection principle, implemented through a suitable choice of a quantile of the lower-level objective, with a Laplace principle-type approximation w.r.t. the upper-level objective function, so that the bi-level optimization problem is solved in an intrinsic manner. We prove the existence of solutions to the corresponding mean-field dynamics: we first establish the stability of our consensus point w.r.t. a combination of Wasserstein and $L^2$ perturbations, and subsequently resort to PDE considerations extending the classical Picard iteration to construct a solution. For such a solution, we provide a global convergence analysis in mean-field law, showing that the solution of the associated nonlinear nonlocal Fokker-Planck equation converges exponentially fast to the unique solution of the bi-level optimization problem, given suitable choices of the hyperparameters. The practicability and efficiency of our CB$^2$O algorithm are demonstrated through extensive numerical experiments in the settings of constrained global optimization, sparse representation learning, and robust (clustered) federated learning.
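To make the consensus-point construction described above concrete, the following is a minimal particle-level sketch, not the authors' implementation. It assumes a finite set of particles, an empirical $\beta$-quantile of the lower-level values, and Gibbs-type weights $\exp(-\alpha G)$ on the selected particles; the exact definition in the paper may include further regularization that is omitted here. The function name `consensus_point`, the default hyperparameters, and the toy objectives are hypothetical choices made for illustration.

```python
import numpy as np


def consensus_point(particles, lower_obj, upper_obj, alpha=10.0, beta=0.2):
    """Simplified, illustrative consensus point for a finite particle set.

    particles : array of shape (N, d)
    lower_obj : callable mapping an (N, d) array to N lower-level values L
    upper_obj : callable mapping an (N, d) array to N upper-level values G
    alpha     : sharpness of the Laplace-type weighting w.r.t. G
    beta      : quantile level used to select particles w.r.t. L
    """
    particles = np.asarray(particles, dtype=float)
    L_vals = lower_obj(particles)
    G_vals = upper_obj(particles)

    # Particle selection: keep particles whose lower-level value is at or
    # below the empirical beta-quantile of L (never empty, since the
    # minimum of L_vals is always <= its quantile).
    q_beta = np.quantile(L_vals, beta)
    keep = L_vals <= q_beta

    # Laplace-type weights w.r.t. G on the selected particles; subtracting
    # the minimum avoids underflow and does not change the normalized weights.
    G_sel = G_vals[keep]
    weights = np.exp(-alpha * (G_sel - G_sel.min()))
    weights /= weights.sum()

    # Weighted average of the selected particles.
    return weights @ particles[keep]


# Toy problem (assumed for illustration): L is minimized on the unit circle,
# while G prefers points close to (2, 0).
lower = lambda th: (np.sum(th**2, axis=1) - 1.0) ** 2
upper = lambda th: np.sum((th - np.array([2.0, 0.0])) ** 2, axis=1)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
m = consensus_point(pts, lower, upper, alpha=50.0, beta=0.5)
print(m)
```

In this toy setting the quantile step discards the two particles that are far from the unit circle, and the weighting step then concentrates on the remaining particle with the smallest value of $G$. Larger `alpha` sharpens that concentration, while `beta` controls how strictly lower-level optimality is enforced before $G$ is consulted.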
