Pushing the Limits of the Reactive Affine Shaker Algorithm to Higher Dimensions
Roberto Battiti, Mauro Brunato
TL;DR
The paper tackles the challenge of optimizing expensive functions in very high dimensions by proposing the Reactive Affine Shaker (RAS), a simple local-search heuristic that relies on an anisotropic, affine-updated search box and only uses success/failure feedback rather than actual function values. RAS updates a box around the current point with random direction vectors, expanding along successful directions via an affine transform $A=\boldsymbol{I} + (\rho-1) \frac{\boldsymbol{\Delta}\boldsymbol{\Delta}^T}{\|\boldsymbol{\Delta}\|^2}$ and contracting after failures, allowing the search to align with promising directions even in high dimensions. The authors conduct an ablation study and compare RAS against state-of-the-art high-dimensional Bayesian optimization methods (e.g., TuRBO, SaasBO, Alebo, HeSBO) and CMA-ES on diverse benchmarks (including BAxUS), showing that while RAS often matches or approaches BO performance, its strengths lie in simplicity, robustness, and insights into the geometry of high-dimensional searches. The work suggests that RAS can serve as a strong baseline or be hybridized with BO to balance exploration and exploitation, and it opens avenues for parallel/multi-start extensions (e.g., M-RAS) and extensions to mixed/combinatorial spaces.
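To see why a transform of the form $A=\boldsymbol{I} + (\rho-1)\,\boldsymbol{\Delta}\boldsymbol{\Delta}^T/\|\boldsymbol{\Delta}\|^2$ stretches the box only along the step direction, it may help to apply it to two kinds of vectors. The short check below uses only the formula above; the auxiliary vector $\boldsymbol{v}$ is introduced here purely for illustration.

$$
A\boldsymbol{\Delta} = \boldsymbol{\Delta} + (\rho-1)\,\frac{\boldsymbol{\Delta}\,(\boldsymbol{\Delta}^T\boldsymbol{\Delta})}{\|\boldsymbol{\Delta}\|^2} = \boldsymbol{\Delta} + (\rho-1)\,\boldsymbol{\Delta} = \rho\,\boldsymbol{\Delta},
\qquad
A\boldsymbol{v} = \boldsymbol{v} + (\rho-1)\,\frac{\boldsymbol{\Delta}\,(\boldsymbol{\Delta}^T\boldsymbol{v})}{\|\boldsymbol{\Delta}\|^2} = \boldsymbol{v}
\quad\text{whenever } \boldsymbol{\Delta}^T\boldsymbol{v}=0 .
$$

So the component of any box vector along $\boldsymbol{\Delta}$ is scaled by $\rho$, while components orthogonal to $\boldsymbol{\Delta}$ are left unchanged: $\rho>1$ elongates the box along $\boldsymbol{\Delta}$, and a factor below 1 in place of $\rho$ would shrink it along the same direction.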
Abstract
Bayesian Optimization (BO) for the minimization of expensive functions of continuous variables uses all the knowledge acquired from previous samples (${\boldsymbol x}_i$ and $f({\boldsymbol x}_i)$ values) to build a surrogate model based on Gaussian processes. The surrogate is then exploited to define the next point to sample, through a careful balance of exploration and exploitation. Initially intended for low-dimensional spaces, BO has recently been modified and applied to very high-dimensional spaces as well (up to about one thousand dimensions). In this paper we consider a much simpler algorithm, called "Reactive Affine Shaker" (RAS). The next sample is always generated with a uniform probability distribution inside a parallelepiped (the "box"). At each iteration, the shape of the box is adapted through an affine transformation, based only on the position of the point $\boldsymbol x$ and on the success or failure in improving the function. The function values are therefore not used directly to modify the search area or to generate the next sample. The entire dimensionality is kept (no active subspaces). Despite its extreme simplicity and its use of only stochastic local search, RAS produces results that are surprisingly close to the state-of-the-art results of high-dimensional versions of BO, although at the cost of some additional function evaluations. An ablation study and an analysis of the probability distribution of directions (improving steps and prevailing box orientation) in very high-dimensional spaces are conducted to better understand the behavior of RAS and to assess the relative importance of the algorithmic building blocks for the final results.
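The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `affine_update` and `ras_minimize`, the default `rho=1.5`, the initial axis-aligned box, the use of $1/\rho$ as the contraction factor after a failure, and the single trial point per iteration are all choices made here for the sake of the example.

```python
import numpy as np


def affine_update(B, delta, factor):
    """Apply A = I + (factor - 1) * delta delta^T / ||delta||^2 to the box.

    B holds the box edge vectors as columns. The product A @ B is computed
    as a rank-one update, so the d x d matrix A is never formed explicitly.
    """
    norm = np.linalg.norm(delta)
    if norm == 0.0:  # degenerate step: leave the box unchanged
        return B
    u = delta / norm
    return B + (factor - 1.0) * np.outer(u, u @ B)


def ras_minimize(f, x0, box_size=1.0, rho=1.5, max_evals=1000, seed=0):
    """Hypothetical single-shot RAS-style local search (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    B = box_size * np.eye(d)  # assumption: start from an axis-aligned box
    fx = f(x)
    evals = 1
    while evals < max_evals:
        # Uniform sample inside the parallelepiped spanned by the columns of B.
        delta = B @ rng.uniform(-1.0, 1.0, size=d)
        f_new = f(x + delta)
        evals += 1
        if f_new < fx:
            # Success: move, and stretch the box along the successful step.
            x, fx = x + delta, f_new
            B = affine_update(B, delta, rho)
        else:
            # Failure: shrink the box along the failed step
            # (assumption: the contraction factor is 1 / rho).
            B = affine_update(B, delta, 1.0 / rho)
    return x, fx


if __name__ == "__main__":
    sphere = lambda z: float(np.sum(z ** 2))
    x_best, f_best = ras_minimize(sphere, np.full(20, 3.0), max_evals=2000)
    print(f_best)
```

Note that the function value enters only through the comparison `f_new < fx`; the box update itself depends solely on the step `delta` and on whether it succeeded, which mirrors the abstract's point that function values are not used directly to shape the search area.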
