Lazy Parameter Tuning and Control: Choosing All Parameters Randomly From a Power-Law Distribution

Denis Antipov, Maxim Buzdalov, Benjamin Doerr

TL;DR

The paper proposes a lazy but effective approach to parameter setting: in each iteration, all parameter values are chosen randomly from a suitably scaled power-law distribution. It proves a performance guarantee that is comparable to, and sometimes even better than, the best performance known for static parameters.

Abstract

Most evolutionary algorithms have multiple parameters and their values drastically affect the performance. Due to the often complicated interplay of the parameters, setting these values right for a particular problem (parameter tuning) is a challenging task. This task becomes even more complicated when the optimal parameter values change significantly during the run of the algorithm since then a dynamic parameter choice (parameter control) is necessary. In this work, we propose a lazy but effective solution, namely choosing all parameter values (where this makes sense) in each iteration randomly from a suitably scaled power-law distribution. To demonstrate the effectiveness of this approach, we perform runtime analyses of the $(1 + (\lambda, \lambda))$ genetic algorithm with all three parameters chosen in this manner. We show that this algorithm on the one hand can imitate simple hill-climbers like the $(1+1)$ EA, giving the same asymptotic runtime on problems like OneMax, LeadingOnes, or Minimum Spanning Tree. On the other hand, this algorithm is also very efficient on jump functions, where the best static parameters are very different from those necessary to optimize simple problems. We prove a performance guarantee that is comparable to the best performance known for static parameters. For the most interesting case that the jump size $k$ is constant, we prove that our performance is asymptotically better than what can be obtained with any static parameter choice. We complement our theoretical results with a rigorous empirical study confirming what the asymptotic runtime results suggest.
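
To make the idea of drawing a parameter from a power-law distribution concrete, here is a minimal sketch. It assumes a bounded distribution on the integers $1, \dots, u$ with $\Pr[i] \propto i^{-\beta}$; the function name `power_law_sampler` and the upper bound used in the example are hypothetical, and the abstract does not specify how the sampled integer is scaled into each of the three parameters.

```python
import random
from itertools import accumulate


def power_law_sampler(beta, upper, rng=None):
    """Return a sampler for i in {1, ..., upper} with Pr[i] proportional to i**(-beta).

    Assumption: a bounded power law on integers; the exact scaling used in
    the paper is not given in this excerpt.
    """
    rng = rng or random.Random()
    values = list(range(1, upper + 1))
    # Cumulative (unnormalized) weights; random.choices normalizes them.
    cum_weights = list(accumulate(i ** (-beta) for i in values))

    def sample():
        return rng.choices(values, cum_weights=cum_weights, k=1)[0]

    return sample


if __name__ == "__main__":
    # beta = 2.8 mirrors the value of beta_lambda quoted in the figure captions;
    # the upper bound 1024 is an arbitrary illustrative choice.
    sample_lambda = power_law_sampler(beta=2.8, upper=1024, rng=random.Random(1))
    print([sample_lambda() for _ in range(20)])
```

Drawing a fresh value in every iteration means that small values dominate, while large values still occur with polynomially small probability rather than exponentially small probability.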

Paper Structure

This paper contains 18 sections, 21 theorems, 93 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Lemma 1

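For all positive integers $a$ and $b$ such that $b \ge a$ and for all $\beta > 0$, the sum $\sum_{i = a}^b i^{-\beta}$ is

$$\begin{cases} \Theta\left((b + 1)^{1 - \beta} - a^{1 - \beta}\right), & \text{if } \beta < 1, \\ \Theta\left(\log\frac{b + 1}{a}\right), & \text{if } \beta = 1, \\ \Theta\left(a^{1 - \beta} - (b + 1)^{1 - \beta}\right), & \text{if } \beta > 1, \end{cases}$$

where $\Theta$ notation is used for $b \to +\infty$.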
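
Sums of this form are commonly estimated by comparing them with the integral of $x^{-\beta}$ over $[a, b + 1]$, which bounds the sum from below because $x^{-\beta}$ is decreasing. The following sketch checks numerically that the ratio of the sum to this integral stays bounded; the helper names are hypothetical and the code is an illustration, not part of the paper.

```python
import math


def power_sum(a, b, beta):
    """Exact value of sum_{i=a}^{b} i**(-beta)."""
    return sum(i ** (-beta) for i in range(a, b + 1))


def integral_estimate(a, b, beta):
    """Integral of x**(-beta) over [a, b + 1], a lower bound on the sum."""
    if beta == 1:
        return math.log((b + 1) / a)
    return ((b + 1) ** (1 - beta) - a ** (1 - beta)) / (1 - beta)


if __name__ == "__main__":
    for beta in (0.5, 1, 2.8):
        for b in (10**2, 10**3, 10**4):
            ratio = power_sum(1, b, beta) / integral_estimate(1, b, beta)
            print(f"beta={beta}, b={b}, sum/integral={ratio:.4f}")
```

A ratio that stays between fixed constants as $b$ grows is what a $\Theta$ statement about the sum asserts.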

Figures (7)

  • Figure 1: Plot of the $\textsc{Jump}_k$ function. As a function of unitation, the function value of a search point $x$ depends only on the number of one-bits in $x$.
  • Figure 2: Illustration of the definition of the critical object.
  • Figure 3: Running times of the heavy-tailed ${(1 + (\lambda , \lambda))}$ GA on OneMax starting from a random point, normalized by $n \ln (n)$, for different $\beta_{pc} = \beta_p = \beta_c$ and $\beta_\lambda=2.8$ in relation to the problem size $n$. The expected running times of $(1 + 1)$ EA, also starting from a random point, are given for comparison.
  • Figure 4: Running times of the heavy-tailed ${(1 + (\lambda , \lambda))}$ GA on OneMax starting from a random point, for $n=2^{14}$ and different $\beta_{pc} = \beta_p = \beta_c$ depending on $\beta_\lambda$.
  • Figure 5: Running times of the heavy-tailed ${(1 + (\lambda , \lambda))}$ GA on Jump, depending on the problem size $n$, in comparison to the $(1 + 1)$ EA. Jump sizes are $k=3$ on the left and $k=5$ on the right.
  • ...and 2 more figures
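
As an illustration of the benchmark plotted in Figure 1, here is a minimal sketch of OneMax and $\textsc{Jump}_k$ on bit strings. It assumes the standard definition of jump functions from the literature (the definition itself is not reproduced in this excerpt); the function names are hypothetical.

```python
def one_max(x):
    """Number of one-bits in the bit string x (a sequence of 0/1 values)."""
    return sum(x)


def jump(x, k):
    """Jump_k under the standard definition (an assumption here).

    The value depends only on the number of one-bits: it grows like OneMax
    up to n - k ones, drops into a valley for n - k < ones < n, and attains
    its maximum n + k at the all-ones string.
    """
    n = len(x)
    ones = one_max(x)
    if ones <= n - k or ones == n:
        return ones + k
    return n - ones


if __name__ == "__main__":
    n, k = 8, 3
    # Print the fitness as a function of the number of one-bits.
    for ones in range(n + 1):
        x = [1] * ones + [0] * (n - ones)
        print(ones, jump(x, k))
```

The valley of low fitness just before the optimum is why a hill-climber has to flip $k$ bits at once to finish, which is where parameter choices suited to OneMax stop being effective.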

Theorems & Definitions (33)

  • Lemma 1: Lemma 1 in AntipovD20ppsn
  • Lemma 2: Lemma 2 in AntipovD20ppsn
  • Lemma 3: Lemma 3 in AntipovD20ppsn
  • Lemma 4
  • Lemma 5: Wald's equation
  • Theorem 6: Multiplicative Drift DoerrJW12algo
  • Lemma 7
  • Theorem 8
  • Lemma 9
  • proof
  • ...and 23 more