
Swarm-Based Inertial Methods for Optimization

Qiyu Wu, Kunhui Luan, Qi Wang

Abstract

We introduce a new class of swarm-based inertial methods (SBIMs) for global minimization, formulated as coupled dissipative inertial dynamical systems derived from the generalized Onsager principle. The proposed framework identifies the friction operator and the scaling of the potential energy, namely the objective function to be minimized, as the key ingredients governing relaxation dynamics over the energy landscape. Within this framework, we propose a new underdamped inertial dynamics whose damping mechanisms incorporate both gradient and Hessian information, allowing the system to adjust damping or acceleration according to the agent trajectories and the curvature of the landscape. Under suitable conditions, we prove that the underdamped system satisfies an energy dissipation law, from which we establish an upper bound on the asymptotic decay rate of the gap between the objective function and its global minimum, given by $O(1/\delta(t))$ (defined in §3). We further construct structure-preserving discretizations that retain both discrete energy dissipation and the convergence rate estimate, $O(1/\delta_k)$ (defined in §3). In addition, we present several other efficient numerical algorithms for the dynamical system. Numerical experiments for all proposed algorithms validate the theory on convex test problems and demonstrate convergence rates in function values that are substantially faster than the theoretical guarantees ($O(1/\delta_k)$). On nonconvex benchmark problems, the proposed methods achieve high success rates in reaching the global minimum, and exhibit more stable energy decay than swarm-based gradient descent and Nesterov methods. Overall, this work provides a systematic framework for the construction and analysis of SBIMs from an energy-dissipative perspective.
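To make the abstract's setup concrete, the following is a minimal sketch of a generic damped inertial (heavy-ball type) system integrated per agent, of the form $m\,\dot v = -\gamma v - \nabla F(x)$, $\dot x = v$. This is an illustrative simplification, not the paper's SBIM: the function name `swarm_inertial_step`, the constant scalar damping `gamma`, and the semi-implicit treatment of the friction term are all assumptions here, whereas the proposed method uses gradient- and Hessian-dependent damping.

```python
import numpy as np

def swarm_inertial_step(x, v, grad_f, m, dt, gamma):
    """One semi-implicit step of the damped inertial system
        m * v' = -gamma * v - grad F(x),   x' = v,
    applied to each agent (row) independently.
    x, v: (n_agents, dim) arrays; m: (n_agents,) masses; gamma: damping."""
    g = np.array([grad_f(xi) for xi in x])                    # per-agent gradients
    # damping treated implicitly for stability, gradient explicitly
    v_new = (m[:, None] * v - dt * g) / (m[:, None] + dt * gamma)
    x_new = x + dt * v_new
    return x_new, v_new

# usage: minimize the convex quadratic F(x) = 0.5*||x||^2 with 4 agents
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 2))
v = np.zeros((4, 2))
m = np.ones(4)
for _ in range(200):
    x, v = swarm_inertial_step(x, v, lambda z: z, m, dt=0.1, gamma=2.0)
```

For the quadratic test function all agents relax toward the global minimum at the origin; the implicit handling of the friction term keeps the iteration stable for moderately large `dt * gamma`.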


Paper Structure

This paper contains 34 sections, 8 theorems, 149 equations, 11 figures, 27 tables, 4 algorithms.

Key Result

Theorem 2.1

Let $\epsilon = 0$ in the SBIM--Nesterov scheme (Algorithm classical:Nesterov). Assume $F \in C^1$, $m_i^{n} > 0$ for all $i$ and all $n \in \mathbb{N}$, and that the updates of $R_i^{n}$ and $a_i^{n}$ follow riai:Nes. Then the discrete energy in Algorithm classical:Nesterov is non-increasing for all $n \in \mathbb{N}$ and all $i \neq i_n^-$, up to higher-order terms of order $\mathcal{O}(\Delta t^2)$.
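Theorem 2.1 asserts that the discrete energy of the SBIM--Nesterov scheme is dissipated up to $\mathcal{O}(\Delta t^2)$ terms. As a standalone illustration of what a discrete energy dissipation check looks like, the sketch below monitors $E^n = F(x^n) + \tfrac{1}{2}\|v^n\|^2$ along a simple semi-implicitly damped heavy-ball iteration on a convex quadratic. This is an assumed stand-in scheme, not the algorithm of the theorem, and the damping constant `gamma` and step size `dt` are illustrative choices.

```python
import numpy as np

def energy(x, v, F):
    """Discrete energy: potential plus kinetic part (unit mass assumed)."""
    return F(x) + 0.5 * float(v @ v)

F = lambda z: 0.5 * float(z @ z)   # convex quadratic test function
gradF = lambda z: z

x = np.array([1.5, -0.7])
v = np.zeros(2)
dt, gamma = 0.1, 2.0

E = [energy(x, v, F)]
for _ in range(100):
    v = (v - dt * gradF(x)) / (1.0 + dt * gamma)  # implicit damping step
    x = x + dt * v
    E.append(energy(x, v, F))  # record energy after each step
```

For this scheme and test function the recorded energy sequence decreases monotonically at every step, mirroring the qualitative behavior the theorem guarantees for the SBIM--Nesterov discretization.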

Figures (11)

  • Figure 1: Comparison of $F-F^{*}$ across methods for the 10D Rotated Hyper-Ellipsoid Function ($h=1/16$). Each subplot is one method (left-to-right, top-to-bottom). The potential energy decays monotonically during the iteration for all methods except for the Nesterov method, which exhibits some slight oscillations. The GD method converges in two steps to the minimum value of $F(x^*)$.
  • Figure 2: Local convergence rate exponent $p_k$ across methods (10D Rotated Hyper-Ellipsoid Function, $h = 1/16$). The first iteration is initialized by $x_1 = x_0 - 10^{-4} \nabla F(x_0)$, and the subsequent iterations follow the corresponding numerical schemes. All inertial methods exhibit a uniform plateau except for the Nesterov method, which shows pronounced oscillations during the iteration.
  • Figure 3: Comparison of $E^k$ across methods for the 10D Rotated Hyper-Ellipsoid Function ($h=1/16$). Each subplot is one method (left-to-right, top-to-bottom). The discrete energy oscillates wildly when using the Nesterov method.
  • Figure 4: 1D test functions ($B=0$).
  • Figure 5: Comparison of $\|v_k\|_2$ across methods for the 10D Rotated Hyper-Ellipsoid Function ($h=1/16$). Each subplot is one method (left-to-right, top-to-bottom).
  • ...and 6 more figures

Theorems & Definitions (19)

  • Definition 2.1
  • Theorem 2.1: Discrete Energy Dissipation of the SBIM--Nesterov Scheme
  • Remark 2.1
  • Theorem 3.1
  • Proof
  • Remark 3.1
  • Theorem 3.2
  • Proof
  • Theorem 3.3
  • Theorem 3.4
  • ...and 9 more