ZEUS: An Efficient GPU Optimization Method Integrating PSO, BFGS, and Automatic Differentiation

Dominik Soos, Marc Paterno, Desh Ranjan, Mohammad Zubair

Abstract

We introduce a novel, efficient computational method, ZEUS, for numerical optimization, and provide an open-source implementation. It has four key ingredients: (1) particle swarm optimization (PSO), (2) the use of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method, (3) automatic differentiation (AD), and (4) GPUs. Our approach addresses the computational challenges inherent in high-dimensional, non-convex optimization problems. In the first phase of the algorithm, we get a potentially good set of starting points using PSO. Thereafter, we run BFGS independently in parallel from these starting points. BFGS is one of the best-performing algorithms for numerical optimization. However, it requires the gradient of the function being optimized. ZEUS integrates automatic differentiation into BFGS thus avoiding the need for the user to calculate derivatives explicitly. The use of GPUs allows ZEUS to speed up the calculations substantially. We carry out systematic studies to explore the trade-offs between the number of PSO iterations taken, starting points, and BFGS iteration depth. We show that a handful of iterations of PSO can improve global convergence when combined with BFGS. We also present performance studies using common test functions. The source code can be found at https://github.com/fnal-numerics/global-optimizer-gpu.
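The two-phase scheme described in the abstract — a few PSO iterations to seed promising starting points, then independent local refinement from each seed — can be illustrated with a minimal CPU sketch in plain Python. This is not the CUDA implementation from the repository: for brevity it substitutes gradient descent with backtracking line search for BFGS, and a hand-written analytic gradient for automatic differentiation. All function and parameter names here are illustrative choices, not part of the ZEUS API.

```python
import random

def rosenbrock(p):
    """2-D Rosenbrock test function; global minimum f(1, 1) = 0."""
    x, y = p
    return (1 - x) ** 2 + 100 * (y - x * x) ** 2

def rosenbrock_grad(p):
    """Analytic gradient (stand-in for automatic differentiation)."""
    x, y = p
    return [-2 * (1 - x) - 400 * x * (y - x * x), 200 * (y - x * x)]

def pso_seed(f, n_particles=30, iters=5, lo=-5.0, hi=5.0, seed=0):
    """Phase 1: a handful of PSO iterations to produce starting points."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(2)] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # per-particle best positions
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest = pbest[g][:]                        # swarm-wide best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            v = f(pos[i])
            if v < pbest_val[i]:
                pbest_val[i], pbest[i] = v, pos[i][:]
                if v < f(gbest):
                    gbest = pos[i][:]
    return pbest

def refine(f, grad, x0, iters=500):
    """Phase 2 stand-in: monotone descent with backtracking (BFGS in the paper)."""
    x, fx = x0[:], f(x0)
    for _ in range(iters):
        g = grad(x)
        t = 1e-3
        while t > 1e-12:
            cand = [x[0] - t * g[0], x[1] - t * g[1]]
            fc = f(cand)
            if fc < fx:          # accept only improving steps
                x, fx = cand, fc
                break
            t *= 0.5
        else:
            break                # no improving step found; stop
    return x

# Run each refinement independently (in ZEUS these run in parallel on the GPU).
seeds = pso_seed(rosenbrock)
best = min((refine(rosenbrock, rosenbrock_grad, s) for s in seeds), key=rosenbrock)
print(rosenbrock(best))
```

In the actual implementation each `refine` call maps to an independent GPU thread of BFGS iterations, and the gradient comes from automatic differentiation rather than a hand-coded formula.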

Paper Structure

This paper contains 33 sections, 8 equations, 6 figures, 4 algorithms.

Figures (6)

  • Figure 1: Box-and-whisker plot showing that performance degrades drastically for the Rastrigin function as the dimensionality of the problem increases when using the same number of particles. For each dimension, we plot 100 runs, where each run uses $10^5$ particles and 5 PSO iterations, and count the number of correct solutions per run. $N_{\mathrm{correct}}$ is the count of optimizations whose Euclidean error is less than 0.5.
  • Figure 2: Visual illustration of the speed advantage achieved by Zeus for 2-dimensional and 5-dimensional objective functions. CPU runtimes were divided by the number of cores to approximate ideal parallel execution. The distributions are based on 100 runs. Vertical jitter was applied to the Zeus points to make them more visible. The Ackley function was left out due to its misbehavior, shown in Figure \ref{fig:ackley}.
  • Figure 3: Performance plots in terms of time (top) and number of correct solutions (bottom) across 100 runs, as a function of the number of PSO iterations, for the 5-dimensional Rastrigin (orange) and Rosenbrock (blue) functions. The Rastrigin function in this dimension has $11^{5} = 161{,}051$ local minima. $N_{\mathrm{correct}}$ is the count of optimizations whose Euclidean error is less than 0.5.
  • Figure 4: Comparative performances for the 10-dimensional Rastrigin function, which has $11^{10}$ local minima. A Euclidean error of 1 for the 10-dimensional Rastrigin function means the point landed in a local minimum, not the global minimum.
  • Figure 5: Simulated dijet mass spectrum and fitted dijet mass spectra. The top panel shows the simulated event counts (black points) compared with the fitted prediction (red line). The bottom panel plots the pull distribution (blue points), defined as $\frac{N_{\text{obs}} - N_{\text{pred}}}{\sigma}$, where $N_{\text{obs}}$ is the simulated count of events, $N_{\text{pred}}$ is the model prediction, and $\sigma$ is the statistical uncertainty per bin. The pulls fluctuate around zero and lie mostly within $\pm2\sigma$, indicating agreement between simulation and prediction.
  • ...and 1 more figure
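Several of the captions above use the same correctness criterion: a run counts toward $N_{\mathrm{correct}}$ when the Euclidean distance between the solution it found and the known global optimum is below 0.5. A minimal sketch of that metric (with illustrative names not taken from the ZEUS code) might look like:

```python
import math

def euclidean_error(x, x_star):
    """Euclidean distance between a found solution x and the known optimum x_star."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_star)))

def n_correct(solutions, x_star, tol=0.5):
    """Count the runs whose solution landed within tol of the known optimum."""
    return sum(1 for x in solutions if euclidean_error(x, x_star) < tol)
```

For example, with the Rastrigin optimum at the origin, a solution at $(0.1, 0.1)$ counts as correct while one at $(2, 2)$ does not.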