Consensus-based algorithms for stochastic optimization problems

Sabrina Bonandin, Michael Herty

TL;DR

This work addresses static stochastic optimization, that is, minimizing $f(x)=\mathbb{E}[F(x,\mathbf{Y})]$, with two consensus-based approaches: a Monte Carlo-based Sample Average Approximation (SAA) and a quadrature-based discretization. Each approach replaces the true objective with a tractable surrogate and uses consensus-based optimization (CBO) particle dynamics to find minimizers. The authors derive mean-field equations in the limit $N\to\infty$ and analyze the connections between the two formulations, including the limit $M\to\infty$ and the joint limit in $N$ and $M$. They prove convergence results in the $2$-Wasserstein distance and in the norm of the consensus point, and provide extensive numerical experiments that validate the predicted rates: the SAA error decays as $O(M^{-1/2})$ and the quadrature error scales with dimension $k$ (with an overall rate near $O(N^{-1/2})$ in favorable cases). The findings clarify when MC-based or quadrature-based surrogates are preferable, quantify the impact of the dimension of the random space on convergence, and point to future enhancements such as variance reduction and variable-sample strategies for stochastic optimization.
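
For orientation, the objects named above can be written out explicitly. The display below is a sketch in standard notation, not a quotation from the paper: it assumes i.i.d. samples $\mathbf{Y}^{(1)},\dots,\mathbf{Y}^{(M)}$ of $\mathbf{Y}$ and the usual Gibbs-weighted average used in CBO. The symbols $\hat{f}_M$ and $\alpha$ match those appearing in the Key Result; the particle labels $X^i_t$ and the consensus-point symbol $x^{\alpha}_t$ are assumptions made for illustration.

$$
\begin{aligned}
f(x) &= \mathbb{E}[F(x,\mathbf{Y})], \\
\hat{f}_M(x) &= \frac{1}{M}\sum_{j=1}^{M} F\big(x,\mathbf{Y}^{(j)}\big), \\
x^{\alpha}_t &= \frac{\sum_{i=1}^{N} X^i_t \, e^{-\alpha \hat{f}_M(X^i_t)}}{\sum_{i=1}^{N} e^{-\alpha \hat{f}_M(X^i_t)}}.
\end{aligned}
$$

In this reading, the SAA approach runs the $N$ particles on $\hat{f}_M$ instead of $f$, and the $O(M^{-1/2})$ rate quoted above is the familiar Monte Carlo rate for the sample average.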

Abstract

We address an optimization problem where the cost function is the expectation of a random mapping. To tackle the problem, two approaches based on the approximation of the objective function by consensus-based particle optimization methods on the search space are developed. The resulting methods are mathematically analyzed using a mean-field approximation and their connection is established. Several numerical experiments show the validity of the proposed algorithms and investigate their rates of convergence.
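
A minimal Python sketch of the SAA route may make the idea concrete. It is an illustration under stated assumptions, not the authors' implementation: it assumes isotropic CBO noise, an Euler-Maruyama time discretization, and a toy integrand $F(x,\mathbf{y})=|x-\mathbf{y}|^2$ with $\mathbf{Y}$ standard normal, for which the SAA minimizer is the sample mean. All function names (`make_saa_objective`, `consensus_point`, `cbo_minimize`) and parameter values are hypothetical.

```python
import numpy as np

def make_saa_objective(F, Y):
    """Build the SAA surrogate x -> (1/M) * sum_j F(x, Y_j) for fixed samples Y."""
    def f_hat(X):
        # F returns an (N, M) array: one value per particle/sample pair.
        return F(X, Y).mean(axis=1)
    return f_hat

def consensus_point(f, X, alpha):
    """Gibbs-weighted average of the particles (weights ~ exp(-alpha * f))."""
    vals = f(X)
    # Subtracting the minimum leaves the ratio unchanged and avoids underflow.
    w = np.exp(-alpha * (vals - vals.min()))
    return (w[:, None] * X).sum(axis=0) / w.sum()

def cbo_minimize(f, X0, alpha=30.0, lam=1.0, sigma=0.5, dt=0.01, steps=1000, rng=None):
    """Euler-Maruyama CBO with isotropic noise (an assumed variant)."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.array(X0, dtype=float)
    for _ in range(steps):
        diff = X - consensus_point(f, X, alpha)
        # Isotropic noise: scaled by each particle's distance to the consensus point.
        dist = np.linalg.norm(diff, axis=1, keepdims=True)
        noise = rng.standard_normal(X.shape)
        X = X - lam * diff * dt + sigma * dist * np.sqrt(dt) * noise
    return consensus_point(f, X, alpha)

def F(X, Y):
    """Toy integrand (assumption): squared distance between x and y."""
    return ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)

rng = np.random.default_rng(0)
d, N, M = 1, 200, 500                      # search dimension, particles, samples
samples = rng.standard_normal((M, d))      # fixed realizations of Y
X0 = rng.uniform(-3.0, 3.0, size=(N, d))   # initial particle positions
x_star = cbo_minimize(make_saa_objective(F, samples), X0, rng=rng)

print("CBO consensus point:", x_star)
print("SAA minimizer (sample mean):", samples.mean(axis=0))
```

The default parameters satisfy $2\lambda > d\sigma^2$, the condition stated in the Key Result. Increasing `M` moves the SAA minimizer closer to the true minimizer $x=0$ of this toy problem, while increasing `N` and `alpha` tightens the consensus point around the SAA minimizer.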

Paper Structure

This paper contains 12 sections, 5 theorems, 61 equations, 9 figures, and 5 tables.

Key Result

Proposition 2.2

Let $\mu_0 \in \mathcal{P}_4(\mathbb{R}^d)$ and let $F(\cdot,\mathbf{y}) \in \mathcal{C}(\mathbb{R}^d,\mathbb{R})$, for any $\mathbf{y} \in E$, satisfy Assumptions \ref{ass:well-posMF_F}. Let $f, \hat{f}_M$ fulfill Assumption \ref{ass:tract} and choose $\lambda,\sigma>0$ with $2\lambda>d\sigma^2$. Assume that t

Figures (9)

  • Figure 1: Diagram illustrating the limits derived in the manuscript for the SAA (outer loop) and quadrature (inner diagonal) approaches. The precise sense in which the convergences apply will be specified in Sections \ref{sec:SAA} and \ref{sec:quadr}. The asterisk (*) denotes the fact that additional variables are sent to infinity to obtain the vertical arrow in the diagram.
  • Figure 2: Derivation of mean-field formulations \ref{eq2: mf eq complete for EF} and \ref{eq2: mf eq complete for fM} in the limit of the number of agents $N$ going to infinity and for fixed $M \in \mathbb{N}$ (and $\overrightarrow{\mathbf{Y}}(\cdot)$), $\alpha>0$ and $t \in [0,T]$. The sense in which the convergence holds is the one that can be formally derived through the propagation of chaos assumption on the marginals and that can be made rigorous by proceeding as in huang2022mean. For the bottom arrow, the convergence holds for any fixed realization $\omega \in \Omega$.
  • Figure 3: Derivation of two relations between mean-field formulations \ref{eq2: mf eq complete for EF} and \ref{eq2: mf eq complete for fM}, w.p.1 and for $M,\alpha,t$ sufficiently large. The precise sense in which the convergence of the vertical arrow holds is given in Proposition \ref{prop:convsol_MF} and Theorem \ref{th:convcons_MF}. The sense in which the horizontal arrows hold is explained in Figure \ref{diag:onlyN}.
  • Figure 4: Derivation of the relations between CBO algorithm \ref{eq3: complete cbo for ftildeN} and mean-field formulations \ref{eq2: mf eq complete for EF} and \ref{eq3: mf eq complete quad}. The limit in $N$ holds under the propagation of chaos assumption on the marginals; the vertical arrow connecting the two mean-field formulations should be understood in the sense expressed by Remark \ref{rem:equalityMF}.
  • Figure 5: Plot of $F$ \ref{def: ackleystocmodR1F} and $f$ \ref{def: ackleystocmodR1f} for the choices $Y = 0.1,3,5$. In plot (a), $x \in [-6,6]$; in plot (b), $x \in [-0.5,0.5]$; in plot (c), $x \in [-15,15]$. Plot (b) highlights the non-differentiability of $f(\cdot)$ and $F(\cdot,Y)$, for any $Y \in \mathbb{R}$, at $x=0$: we remark that the loss of regularity does not affect our CBO algorithms and theoretical results, as they all hold in a non-differentiable setting. Plot (c) shows that, for $x$ sufficiently large, $x \mapsto F(x,Y)$ is independent of $Y$.
  • ...and 4 more figures

Theorems & Definitions (17)

  • Remark 2.1
  • Proposition 2.2
  • Proof of Proposition \ref{prop:convsol_MF}
  • Remark 2.3
  • Lemma 2.1: arnold1974stochastic for $\mu_0 \in \mathcal{P}_4(\mathbb{R}^d)$
  • Lemma 2.2: carrillo2018analytical
  • Lemma 2.3
  • Proof of Lemma \ref{lem:convcons_MF}
  • Theorem 2.4
  • Proof of Theorem \ref{th:convcons_MF}
  • ...and 7 more