Plus Strategies are Exponentially Slower for Planted Optima of Random Height

Johannes Lengler, Leon Schiller, Oliver Sieberling

TL;DR

This work analyzes two evolutionary-algorithm selection schemes on a rugged benchmark, DisOM, where a small fraction of points acquire random height distortions drawn from a distribution $\mathcal{D}$. The authors prove that the plus strategy $(1+\lambda)$-EA becomes super-polynomially slow under broad tail conditions on $\mathcal{D}$ (e.g., exponential and Gaussian tails), while the comma strategy $(1,\lambda)$-EA remains efficient at achieving a fixed target $n - k^*$. The core technique combines frozen-noise modeling, uniform exploration arguments, and a distortion-kickstart analysis to show that once near the optimum, plus selection is likely to encounter progressively larger distortions that trap progress; in contrast, comma selection continues to escape local optima due to its non-elitist nature. The results extend prior work by analyzing random-height distortions rather than constant shifts, with experiments validating the theoretical predictions. Overall, the paper highlights the fragility of elitist strategies on rugged landscapes with sparsely planted local optima and demonstrates the robustness of non-elitist approaches under realistic distortion models.

Abstract

We compare the $(1,\lambda)$-EA and the $(1 + \lambda)$-EA on the recently introduced benchmark DisOM, which is the OneMax function with randomly planted local optima. Previous work showed that if all local optima have the same relative height, then the plus strategy never loses more than a factor $O(n\log n)$ compared to the comma strategy. Here we show that even small random fluctuations in the heights of the local optima have a devastating effect for the plus strategy and lead to super-polynomial runtimes. On the other hand, due to their ability to escape local optima, comma strategies are unaffected by the height of the local optima and remain efficient. Our results hold for a broad class of possible distortions and show that the plus strategy, but not the comma strategy, is generally deceived by sparse unstructured fluctuations of a smooth landscape.
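
The setup described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`make_disom`, `mutate`, `run_ea`) are hypothetical, the mutation operator is assumed to be standard bit mutation with rate $1/n$, ties in the plus strategy are assumed to be accepted, and the stopping rule (OneMax value at least $n - k^*$) is one possible reading of the target mentioned in the TL;DR. The distortion of each point is drawn lazily once and then cached, which mimics frozen noise.

```python
import random


def make_disom(n, p, sample_distortion, rng):
    """Return f(x) = OneMax(x) + distortion(x) with frozen (cached) noise.

    Each point is distorted independently with probability p (assumption:
    undistorted points get distortion 0). n is kept for documentation only.
    """
    cache = {}  # distortion of every point evaluated so far

    def fitness(x):
        key = tuple(x)
        if key not in cache:
            distorted = rng.random() < p
            cache[key] = sample_distortion(rng) if distorted else 0.0
        return sum(x) + cache[key]

    return fitness


def mutate(x, rng):
    """Standard bit mutation: flip each bit independently with probability 1/n."""
    n = len(x)
    return [b ^ 1 if rng.random() < 1.0 / n else b for b in x]


def run_ea(fitness, n, lam, k_star, plus, rng, max_gens=10_000):
    """Run the (1+lam)-EA (plus=True) or the (1,lam)-EA (plus=False).

    Returns the number of generations used and the parent fitness history.
    Stopping rule (assumption): OneMax value of the parent is >= n - k_star.
    """
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = fitness(x)
    history = [fx]
    for gen in range(max_gens):
        if sum(x) >= n - k_star:
            return gen, history
        offspring = [mutate(x, rng) for _ in range(lam)]
        best = max(offspring, key=fitness)
        fb = fitness(best)
        # Plus selection keeps the parent unless the best offspring is at
        # least as good; comma selection always moves to the best offspring.
        if not plus or fb >= fx:
            x, fx = best, fb
        history.append(fx)
    return max_gens, history
```

For example, `make_disom(150, 0.0245, lambda r: r.expovariate(0.4), random.Random(0))` builds a landscape with exponentially distributed distortions, and calling `run_ea` once with `plus=True` and once with `plus=False` gives one run of each strategy. The only difference between the two strategies is the single acceptance test, which is the mechanism the abstract refers to: the comma strategy can leave a planted local optimum, while the plus strategy cannot accept a worse point.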

Paper Structure

This paper contains 31 sections, 27 theorems, 44 equations, 5 figures, and 2 algorithms.

Key Result

Theorem 2.1

Let $\varepsilon > 0$ be a sufficiently small constant. Assume further that $\mathcal{D}$ satisfies ass:distribution, and that $p$ is such that ass:oleaineff is met. Then with high probability, the $(1 + \lambda)$-EA on $\textsc{DisOM}_{\mathcal{D}}$ either spends $\mathcal{T} = \exp(n^{\varepsilon/

Figures (5)

  • Figure 1: Illustration for the proof of \ref{lem:onlylogninfitnesslayers}. The circles represent all distorted points in $\textsc{Ol}(k)$; the red ones are those visited by the algorithm. Every time we sample a new distorted point $y$ in $\textsc{Ol}(k)$, we cut the set of points with distortion larger than that of $y$ in half (in expectation). Hence, only $\mathcal{O}(\log(|\textsc{Ol}(k)|))$ distorted points in $\textsc{Ol}(k)$ are visited w.h.p.
  • Figure 2: Illustration for the proof of \ref{lem:onlyfewoftheneighborhoodsiuncovered}. We generate offspring from parent $y$ with $\textsc{HD}(x, y) = \ell$ in two steps: first, decide which $k \sim \text{Bin}(\ell, 1/n)$ bits to flip among the bits that differ between $x$ and $y$ to create the intermediate offspring $z^{(1)}$; then, decide which bits to flip among the remaining bits to create the final offspring $z^{(2)}$.
  • Figure 3: Total fitness, OneMax-fitness, and distortion of one run of the $(1 + \lambda)$-EA and the $(1 , \lambda)$-EA on $\textsc{DisOM}_{\mathcal{D}}$ with $\mathcal{D}$ being an exponential distribution with rate parameter $0.4$. We use $n = 150$, $\lambda = 8$, $p = 0.0245$ and $k^* = 2.12$ with a cutoff of $10^9$ generations. The x-axis is scaled logarithmically.
  • Figure 4: Number of generations required by the $(1 + \lambda)$-EA and the $(1 , \lambda)$-EA to optimize $\textsc{DisOM}_{\mathcal{D}}$ with $\mathcal{D}$ being either an exponential distribution with rate parameter $0.4$ or a uniform distribution from $0$ to $4$. We take the median over $49$ runs. We set $\lambda = \lfloor 1.5 \log{n} \rceil$, $p = 0.3n^{-0.5}$ and $k^* = n^{0.15}$ with a cutoff of $10^6$ generations. The y-axis is scaled logarithmically.
  • Figure 5: Normalized number of generations required by the $(1 + \lambda)$-EA to optimize $\textsc{DisOM}_{\mathcal{D}}$ for different distortion probabilities $p$. The distortions are sampled from an exponential distribution with rate parameter $0.4$ truncated at different cutoffs $d$. We set $n=300$, $\lambda = 9$, $k^* = 2.35$ and average over $49$ runs. Note that the $y$-axis shows the (averaged) run time re-scaled by a factor of $(p\text{Pr}\left[ D \ge d \right])^{-1}$. This gives evidence that the run time is indeed proportional to $(p\text{Pr}\left[ D \ge d \right])^{-1}$ as predicted by \ref{thm:olealowerbound}.
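
The halving argument in the caption of Figure 1 can be checked numerically with a toy model. The sketch below is an idealization and not the paper's proof: it assumes that each newly visited distorted point is uniform among the points whose distortion exceeds the current one, so only the ranks of the distortions matter. Under this assumption the expected number of visited points among $m$ candidates is the harmonic number $H_m = \sum_{i=1}^{m} 1/i$, which grows like $\log m$. The function names are hypothetical.

```python
import random


def climb_steps(m, rng):
    """Count visited points when each new point is uniform among the
    points that still have a larger distortion (toy model, ranks only)."""
    steps = 0
    remaining = m  # number of points with distortion above the current one
    while remaining > 0:
        rank = rng.randrange(remaining)  # rank among the still-larger points
        # Only the points above the newly visited one stay eligible,
        # so the eligible set is halved in expectation.
        remaining = remaining - 1 - rank
        steps += 1
    return steps


def harmonic(m):
    """Harmonic number H_m, the expected step count in this toy model."""
    return sum(1.0 / i for i in range(1, m + 1))


def average_steps(m, trials, seed=0):
    """Empirical mean of climb_steps over independent trials."""
    rng = random.Random(seed)
    return sum(climb_steps(m, rng) for _ in range(trials)) / trials
```

Comparing `average_steps(m, trials)` with `harmonic(m)` for growing `m` shows the logarithmic growth that the caption states: doubling `m` adds roughly a constant number of visited points rather than doubling it.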

Theorems & Definitions (28)

  • Theorem 2.1
  • Theorem 2.2: Lower Bound
  • Remark 1
  • Corollary 2.1
  • Corollary 2.2
  • Corollary 2.3
  • Theorem 2.3: Upper Bound
  • Lemma 1
  • Lemma 2
  • Theorem 1 (Dubhashi and Panconesi, 2009)
  • ...and 18 more