Randomness as Reference: Benchmark Metric for Optimization in Engineering

Stefan Ivić, Siniša Družeta, Luka Grbčić

TL;DR

A novel performance metric is introduced, which employs random sampling as a statistical reference, providing nonlinear normalization of objective values and enabling unbiased comparison of algorithmic efficiency across heterogeneous problems, thereby narrowing the gap between the available benchmark tests and realistic engineering applications.

Abstract

Benchmarking optimization algorithms is fundamental for the advancement of computational intelligence. However, widely adopted artificial test suites exhibit limited correspondence with the diversity and complexity of real-world engineering optimization tasks. This paper presents a new benchmark suite comprising 231 bounded, continuous, unconstrained optimization problems, the majority derived from engineering design and simulation scenarios, including computational fluid dynamics and finite element analysis models. In conjunction with this suite, a novel performance metric is introduced, which employs random sampling as a statistical reference, providing nonlinear normalization of objective values and enabling unbiased comparison of algorithmic efficiency across heterogeneous problems. Using this framework, 20 deterministic and stochastic optimization methods were systematically evaluated through hundreds of independent runs per problem, ensuring statistical robustness. The results indicate that only a few of the tested optimization methods consistently achieve excellent performance, while several commonly used metaheuristics exhibit severe efficiency loss on engineering-type problems, emphasizing the limitations of conventional benchmarks. Furthermore, the conducted tests are used for analyzing various features of the optimization methods, providing practical guidelines for their application. The proposed test suite and metric together offer a transparent, reproducible, and practically relevant platform for evaluating and comparing optimization methods, thereby narrowing the gap between the available benchmark tests and realistic engineering applications.
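The idea of using random sampling as a statistical reference can be illustrated with a small sketch. The abstract does not define the metric itself, so the code below only estimates a random-sampling reference value for a fixed evaluation budget. The toy objective `sphere`, the helper names `random_search_best` and `random_sampling_reference`, and the choice of the median over repeated runs are all assumptions made for illustration, not the authors' implementation.

```python
import random
import statistics


def sphere(x):
    """Toy objective (hypothetical stand-in for a benchmark problem)."""
    return sum(v * v for v in x)


def random_search_best(f, bounds, n_evals, rng):
    """Best objective value found by uniform random sampling within bounds."""
    best = float("inf")
    for _ in range(n_evals):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        best = min(best, f(x))
    return best


def random_sampling_reference(f, bounds, n_evals, n_runs, seed=0):
    """Median of the best values over repeated random-sampling runs.

    Assumption: the median is used as the summary statistic; the paper
    may define its reference point differently.
    """
    rng = random.Random(seed)
    bests = [random_search_best(f, bounds, n_evals, rng) for _ in range(n_runs)]
    return statistics.median(bests)


bounds = [(-5.0, 5.0)] * 3
f_ref_small = random_sampling_reference(sphere, bounds, n_evals=20, n_runs=51)
f_ref_large = random_sampling_reference(sphere, bounds, n_evals=2000, n_runs=51)
print(f"reference after 20 evaluations:   {f_ref_small:.4f}")
print(f"reference after 2000 evaluations: {f_ref_large:.4f}")
```

An optimizer's result for the same evaluation budget can then be compared against this reference value: a method that does not beat it is doing no better than blind sampling on that problem.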

Paper Structure

This paper contains 22 sections, 19 equations, 15 figures, 3 tables.

Figures (15)

  • Figure 1: The top plot shows an example of the $\mathbb{G}$ mapping and the possible relations of the normalized logarithmic grade $\mathbb{G}$ to the normalized linear grade $\rho(f)$. The bottom plot shows the sensitivity of $\alpha$, and consequently of $\mathbb{G}$, to limiting values of $\rho$, which occur when $f^\circ$ gets too close to $f^-$ or $f^+$.
  • Figure 2: Distribution of the IndagoBench25 test functions. Function multimodality $M$ is assessed as per Eq. \ref{eq:multimodality}. Point size is defined by the average $\mathbb{G}$ across all optimization methods (a larger circle means a harder problem, i.e. a lower $\mathbb{G}$).
  • Figure 3: Subplots in the upper panel show function information, along with the convergence (A) and the distribution (B) of random sampling. The reference points ($f^+$, $f^\circ$ and $f^-$) (E) are used to determine the $\mathbb{G}(f)$ mapping function (C) along with its parameter $\alpha$ (D). The lower panel displays a benchmark for a selected optimization method (SSA). Statistical convergence of multifold optimization runs is tracked in (F). Convergence of the median $\mathbb{G}$ over evaluations is shown with the magenta curve (H), while convergence of its relative value (when the reference RS point $f^\circ$ for the same number of evaluations is used) is shown with the dark blue curve (G). The values of relative $\mathbb{G}$ for 10%, 50% and 100% of evaluations are shown in labels (I). The distribution of $\mathbb{G}_{100\%}$ is shown with gray bins (J). Vertical lines and labels (K) indicate median values for $\mathbb{G}_{10\%}$, $\mathbb{G}_{50\%}$ and $\mathbb{G}_{100\%}$. The black dashed line and its label (L) show the value of the repeating weighted $\mathbb{G}_{RW}$, which is obtained from the sampling points (N) indicated in the $\mathbb{G}_{100\%}$ percentile graph (M). If the method finds the best overall solution, it is indicated by the red star in the percentile graph (O).
  • Figure 4: The convergence of selected optimization methods for the shortest path problem SP_zigzag20_50D. Although all four median solutions achieve a very high $\mathbb{G}$, none converged to the global optimum, as is visible from the distinct differences in the depicted path patterns. Switching between local optima can be observed as steep jumps in the convergence plots of all four methods.
  • Figure 5: The convergence of selected optimization methods for the ergodic problem EC_phi_50D shows the interesting potential of repeated runs of the LBFGSB method. The two bottom-left images clearly depict the difference in $\mathbb{G}$ score as related to the realized trajectories. The spectral metric makes the problem even more challenging (direct metric based EC_phi+s_50D in the bottom-middle vs. spectral metric based EC_phi_50D in the bottom-left). Additionally, a contrasting EC problem, EC_gaussian4_50D, is shown in the bottom-right.
  • ...and 10 more figures