Faster Optimization Through Genetic Drift

Cella Florescu; Marc Kaufmann; Johannes Lengler; Ulysse Schaller

TL;DR

This paper analyzes the compact Genetic Algorithm (cGA) with hypothetical population size $K$ on Dynamic BinVal (DynBV), a hard dynamic linear benchmark, and contrasts its performance with that on OneMax. It establishes a clear trade-off. The conservative regime (large $K$, i.e. small step sizes) suffers a substantial slowdown on DynBV: with high probability the optimum is not sampled during the first $\Omega(K\cdot\min\{K,n\})$ iterations, which for $K=\omega(n)$ gives a runtime of $\omega(n^2)$. The aggressive regime (small $K$, i.e. large step sizes) preserves a quasi-linear runtime of $O(n\,\mathrm{polylog}(n))$ on DynBV, with a conjectured $O(n\log n)$ under standard boundaries. The analysis employs drift theorems, Chernoff bounds, and hypergeometric tails to characterize signal steps, sampling variance, and frequency dynamics, and is complemented by simulations that confirm the phase transition and practical speedups for small $K$. The results highlight that embracing genetic drift can dramatically accelerate optimization on hard dynamic functions, informing parameter choices for EDAs in drift-prone settings. Overall, the paper provides tight lower bounds for the conservative regime, a conservative upper bound with polynomial $K$, and nontrivial upper bounds for the aggressive regime, supported by empirical validation.

Abstract

The compact Genetic Algorithm (cGA), parameterized by its hypothetical population size $K$, offers a low-memory alternative to evolving a large offspring population of solutions. It evolves a probability distribution, biasing it towards promising samples. For the classical benchmark OneMax, the cGA has two different modes of operation: a conservative one with small step sizes $\Theta(1/(\sqrt{n}\log n))$, which is slow but prevents genetic drift, and an aggressive one with large step sizes $\Theta(1/\log n)$, in which genetic drift leads to wrong decisions, but those are corrected efficiently. On OneMax, an easy hill-climbing problem, both modes lead to optimization times of $\Theta(n\log n)$ and are thus equally efficient. In this paper we study how both regimes change when we replace OneMax by the harder hill-climbing problem DynamicBinVal. It turns out that the aggressive mode is not affected and still yields quasi-linear runtime $O(n\cdot \mathrm{polylog}(n))$. However, the conservative mode becomes substantially slower, yielding a runtime of $\Omega(n^2)$, since genetic drift can only be avoided with smaller step sizes of $O(1/n)$. We complement our theoretical results with simulations.
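The update rule described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `dynbv_better` and `cga_dynbv`, the boundaries $1/n$ and $1-1/n$, the all-ones optimum, and the stopping rule (stop once the optimum is sampled) are assumptions made for illustration. The step size is $1/K$, so large $K$ corresponds to the conservative mode and small $K$ to the aggressive one. Since DynBV weights are distinct powers of two, the comparison only needs the highest-weight position at which the two samples differ.

```python
import random


def dynbv_better(x, y, rng):
    """Return (winner, loser) under a freshly drawn DynBV weight order.

    perm lists the positions from largest weight to smallest. Because the
    weights are distinct powers of two, the first position in this order
    where x and y differ decides the comparison.
    """
    perm = list(range(len(x)))
    rng.shuffle(perm)
    for i in perm:
        if x[i] != y[i]:
            return (x, y) if x[i] > y[i] else (y, x)
    return x, y  # identical samples: tie, no frequency will move


def cga_dynbv(n, K, max_iters, seed=0):
    """Run the cGA with step size 1/K; return the first iteration in which
    the all-ones string is sampled, or None if max_iters is exhausted.

    Assumes n >= 2 and boundaries 1/n and 1 - 1/n (illustrative choices).
    """
    rng = random.Random(seed)
    lo, hi = 1.0 / n, 1.0 - 1.0 / n
    p = [0.5] * n  # one frequency per bit position
    for t in range(1, max_iters + 1):
        x = [1 if rng.random() < p[i] else 0 for i in range(n)]
        y = [1 if rng.random() < p[i] else 0 for i in range(n)]
        if sum(x) == n or sum(y) == n:
            return t
        winner, loser = dynbv_better(x, y, rng)
        for i in range(n):
            if winner[i] != loser[i]:
                # Move the frequency towards the winner's bit value.
                step = 1.0 / K if winner[i] == 1 else -1.0 / K
                p[i] = min(hi, max(lo, p[i] + step))
    return None
```

A call such as `cga_dynbv(n=50, K=10, max_iters=200_000)` returns an iteration count or `None`; sweeping `K` for fixed `n` is one way to observe the two regimes empirically.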

Paper Structure (24 sections, 24 theorems, 45 equations, 2 figures, 1 algorithm)


Introduction
Our results
The conservative regime is slow.
The aggressive regime is fast.
Discussion of the setup and related work
Signal steps and DynBV.
Related work.
Overview of the paper
Setting
The Algorithm: the cGA with hypothetical population size $K$
The Benchmark: Dynamic BinVal
Terminology
Signal step.
Random step.
Sampling variance.
...and 9 more sections

Key Result

Theorem

Let $\bar{p}\in (0, \frac{1}{2})$ be arbitrary and consider the cGA with parameter $K =O(\mathrm{poly}(n))$ and boundaries at $\bar{p}$ and $1-\bar{p}$ on DynBV. Then with high probability, the optimum is not sampled during the first $\Omega(K \cdot \min\{K, n\})$ iterations.
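
To relate this bound to the regimes quoted in the abstract, one can split on the minimum. This is a worked reading of the statement, not an additional result from the paper:

$$K\cdot\min\{K,n\}=\begin{cases}K^2, & K\le n,\\ Kn, & K> n.\end{cases}$$

Substituting, $K=\Theta(n)$ already forces $\Omega(n^2)$ iterations and any polynomial $K=\omega(n)$ forces $\omega(n^2)$, whereas for $K=\mathrm{polylog}(n)$ the bound is only polylogarithmic and therefore does not conflict with the quasi-linear upper bound of the aggressive regime.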

Figures (2)

Figure 1: Number of iterations for the optimization of Dynamic BinVal with the cGA when $6\le K \le 10000$. The right plot shows the subinterval $18 \le K \le 90$. The median over 50 runs is plotted.
Figure 2: Number of bits that reach the lower boundary $1-\frac{1}{n}$ for the range $5 \le K \le 800$. The median over 20 runs is plotted.

Theorems & Definitions (39)

Theorem
Theorem
Theorem
Theorem
Theorem: Theorem 2 in oliveto2015improved
Theorem: Theorem 15 in lehre2013general
Theorem: Theorem 1 in neumann2010few
Lemma: skala2013hypergeometric
Theorem: Chernoff Bound, doerr2020probabilistic
Proposition
...and 29 more
