Mutation Strength Adaptation of the $(μ/μ_I, λ)$-ES for Large Population Sizes on the Sphere Function
Amir Omeradzic, Hans-Georg Beyer
TL;DR
This work investigates how mutation-strength adaptation in a $(μ/μ_I, λ)$-ES with isotropic mutations behaves under large population sizes on the sphere function. It combines theoretical steady-state analyses with experiments to characterize the scale-invariant mutation strength $σ^*$ and a steady-state adaptation factor $γ$ across CSA parametrizations and $σ$SA sampling schemes. The key findings are that only the $\sqrt{N}$ CSA variant maintains a roughly constant $γ$ while delivering strong progress, whereas the other CSA variants adapt more slowly as $N$ or $μ/N$ grows. The performance of $σ$SA depends strongly on the learning parameter $τ$ and on whether log-normal or normal mutation sampling is used, with notable differences in bias and stability. These results inform adaptive population-control strategies for ES in noisy or multimodal optimization settings.
Abstract
The mutation strength adaptation properties of a multi-recombinative $(μ/μ_I, λ)$-ES are studied for isotropic mutations. To this end, standard implementations of cumulative step-size adaptation (CSA) and mutative self-adaptation ($σ$SA) are investigated experimentally and theoretically by assuming large population sizes ($μ$) in relation to the search space dimensionality ($N$). The adaptation is characterized in terms of the scale-invariant mutation strength on the sphere in relation to its maximum achievable value for positive progress. Standard CSA variants show notably different adaptation properties and progress rates on the sphere, becoming slower or faster as $μ$ or $N$ is varied. This is shown by investigating common choices for the cumulation and damping parameters. Standard $σ$SA variants (with default learning parameter settings) can achieve faster adaptation and larger progress rates than CSA. However, it is also shown how self-adaptation negatively affects the achieved progress rate. Furthermore, differences in the adaptation and stability of $σ$SA with log-normal and normal mutation sampling are elaborated.
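The setting described in the abstract can be illustrated with a minimal sketch of a $(μ/μ_I, λ)$-ES with log-normal $σ$SA on the sphere. This is not the authors' implementation. The function name `sigma_sa_es`, the default values, the learning parameter $τ = 1/\sqrt{2N}$, and the normalization $σ^* = σN/R$ (with $R$ the distance of the parent to the optimum) are assumptions based on common conventions, chosen only to make the example concrete.

```python
import numpy as np


def sigma_sa_es(N=10, mu=30, lam=60, sigma0=1.0, generations=200, tau=None, seed=0):
    """Illustrative (mu/mu_I, lambda)-ES with log-normal self-adaptation on the sphere.

    Returns the final parent, the final mutation strength, and a list of
    (R, sigma_star) pairs recorded after each generation.
    """
    rng = np.random.default_rng(seed)
    if tau is None:
        tau = 1.0 / np.sqrt(2.0 * N)  # assumed default learning parameter
    y = np.ones(N)                    # parent (recombinant), arbitrary start point
    sigma = sigma0
    history = []
    for _ in range(generations):
        # Each offspring first mutates its own sigma (log-normal sampling).
        sigmas = sigma * np.exp(tau * rng.standard_normal(lam))
        # Isotropic Gaussian mutation of the parent with the offspring's sigma.
        offspring = y + sigmas[:, None] * rng.standard_normal((lam, N))
        # Sphere fitness f(y) = ||y||^2, to be minimized.
        fitness = np.einsum("ij,ij->i", offspring, offspring)
        best = np.argsort(fitness)[:mu]
        # Intermediate recombination of both the positions and the sigmas.
        y = offspring[best].mean(axis=0)
        sigma = sigmas[best].mean()
        R = np.linalg.norm(y)
        history.append((R, sigma * N / R))  # assumed normalization sigma* = sigma*N/R
    return y, sigma, history


if __name__ == "__main__":
    _, _, hist = sigma_sa_es()
    R_final, sigma_star_final = hist[-1]
    print(f"final R = {R_final:.3e}, final sigma* = {sigma_star_final:.3f}")
```

Here $μ > N$, which mirrors the large-population regime considered in the abstract. Replacing the log-normal factor with a normal perturbation of $σ$ would give the other sampling scheme mentioned there; CSA would require an additional search-path update that this sketch does not include.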
