Hyper-Heuristics Can Profit From Global Variation Operators

Benjamin Doerr; Johannes F. Lutzeyer

TL;DR

The paper rigorously analyzes the Move Acceptance Hyper-Heuristic (MAHH) on the Cliff and Jump benchmarks. It proves a general, non-asymptotic lower bound for the MAHH with one-bit mutation on Jump, showing a substantial slowdown compared to simple elitist EAs for gap sizes $m=O(n^{1/2})$. It then shows that with the global bit-wise mutation operator the MAHH admits an upper bound that matches or beats that of the $(1+1)$ EA, which is the first rigorous speed-up obtained by combining a hyper-heuristic with global mutation on Jump. The results delineate the limits of the MAHH on Jump while highlighting a promising hybrid approach that combines accepting inferior solutions with global variation, suggesting productive directions for non-elitist hyper-heuristics. Overall, the work clarifies when the MAHH with global mutation helps and when the Cliff-specific structure drives the MAHH's success, offering insights for the future design of multimodal optimization heuristics.
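To make the two algorithm variants concrete, here is a minimal Python sketch of the MAHH on Jump with either mutation operator. It is an illustration under stated assumptions, not the authors' implementation: the function names (`jump`, `one_bit_mutation`, `bitwise_mutation`, `mahh`) are hypothetical, the Jump definition is the standard one from the literature, the mutation rate $1/n$ is the usual default for bit-wise mutation, and the ONLYIMPROVING step is assumed to accept strict improvements only.

```python
import random

def jump(x, m):
    """Standard Jump_m value of a bit list x (to be maximized)."""
    n, ones = len(x), sum(x)
    if ones <= n - m or ones == n:
        return m + ones
    return n - ones  # inside the valley of low objective values

def one_bit_mutation(x, rng):
    """Local operator: flip exactly one uniformly chosen bit."""
    y = list(x)
    y[rng.randrange(len(y))] ^= 1
    return y

def bitwise_mutation(x, rng):
    """Global operator: flip each bit independently with probability 1/n."""
    rate = 1.0 / len(x)
    return [b ^ 1 if rng.random() < rate else b for b in x]

def mahh(f, n, p, mutate, rng, max_iters):
    """Return the first iteration at which the all-ones string is held, or None."""
    x = [rng.randrange(2) for _ in range(n)]
    fx = f(x)
    if sum(x) == n:
        return 0
    for t in range(1, max_iters + 1):
        y = mutate(x, rng)
        fy = f(y)
        # With probability p accept any offspring (ALLMOVES); otherwise
        # accept only strict improvements (ONLYIMPROVING, an assumption here).
        if rng.random() < p or fy > fx:
            x, fx = y, fy
        if sum(x) == n:
            return t
    return None

if __name__ == "__main__":
    f = lambda x: jump(x, 2)
    for name, op in [("one-bit", one_bit_mutation), ("bit-wise", bitwise_mutation)]:
        steps = mahh(f, 8, 0.05, op, random.Random(1), 10**6)
        print(name, steps)
```

Single runs on such small instances say nothing about the asymptotic bounds; the sketch only shows where the acceptance rule and the choice of mutation operator enter the loop.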

Abstract

In recent work, Lissovoi, Oliveto, and Warwicker (Artificial Intelligence (2023)) proved that the Move Acceptance Hyper-Heuristic (MAHH) leaves the local optimum of the multimodal CLIFF benchmark with remarkable efficiency. The $O(n^3)$ runtime of the MAHH, for almost all cliff widths $d \ge 2$, is significantly better than the $\Theta(n^d)$ runtime of simple elitist evolutionary algorithms (EAs) on CLIFF. In this work, we first show that this advantage is specific to the CLIFF problem and does not extend to the JUMP benchmark, the most prominent multimodal benchmark in the theory of randomized search heuristics. We prove that for any choice of the MAHH selection parameter $p$, the expected runtime of the MAHH on a JUMP function with gap size $m = O(n^{1/2})$ is at least $\Omega(n^{2m-1} / (2m-1)!)$. This is significantly slower than the $O(n^m)$ runtime of simple elitist EAs. Encouragingly, we also show that replacing the local one-bit mutation operator in the MAHH with the global bit-wise mutation operator, commonly used in EAs, yields a runtime of $\min\{1, O(\frac{e\ln(n)}{m})^m\} \, O(n^m)$ on JUMP functions. This is at least as good as the runtime of simple elitist EAs. For larger values of $m$, this result proves an asymptotic performance gain over simple EAs. As our proofs reveal, the MAHH profits from its ability to walk through the valley of lower objective values in moderate-size steps, always accepting inferior solutions. This is the first time that such an optimization behavior is proven via mathematical means. Generally, our result shows that combining two ways of coping with local optima, global mutation and accepting inferior solutions, can lead to considerable performance gains.
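The growth rates quoted in the abstract can be compared numerically. The following Python sketch evaluates the leading terms with all hidden constants set to 1, which is an assumption made purely for illustration (asymptotic notation does not fix these constants); the helper names are hypothetical.

```python
import math

def lower_bound_one_bit(n, m):
    """Leading term n^(2m-1) / (2m-1)! of the one-bit MAHH lower bound."""
    return n ** (2 * m - 1) / math.factorial(2 * m - 1)

def elitist_ea_bound(n, m):
    """Leading term n^m of the runtime of simple elitist EAs on Jump."""
    return float(n ** m)

def global_mutation_factor(n, m):
    """Factor min{1, (e ln(n) / m)^m} from the global-mutation upper bound."""
    return min(1.0, (math.e * math.log(n) / m) ** m)

if __name__ == "__main__":
    n = 100
    for m in (2, 3, 5, 20):
        ratio = lower_bound_one_bit(n, m) / elitist_ea_bound(n, m)
        factor = global_mutation_factor(n, m)
        print(f"m={m}: one-bit/elitist ratio {ratio:.3g}, global factor {factor:.3g}")
```

With constants ignored, the ratio of the first two terms is $n^{m-1}/(2m-1)!$, and the factor in the global-mutation bound drops below 1 exactly when $m > e\ln(n)$, which is the regime in which the abstract claims an asymptotic gain over simple EAs.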

Paper Structure (12 sections, 17 theorems, 65 equations, 2 algorithms)

Introduction
Related Work
Hyper-Heuristics
Runtime Analyses on Cliff and Jump Functions
Preliminaries
Algorithms
Benchmark Function Classes
Mathematical Tools
Lower Bounds on the Runtime of the MAHH with One-Bit Mutation
Upper Bound on the Runtime of the MAHH with Global Mutation
Upper Bound on the Runtime of the MAHH with One-Bit Mutation
Conclusion

Key Result

Theorem 1

Let $S \subseteq \mathbb{R}$ be a finite set of positive numbers with minimum $s_{\min}$. Let $(X_t)_{t\geq0}$ be a sequence of random variables over $S \cup \{0\}$. Let $T$ be the random variable that denotes the first point in time $t \in \mathbb{N}$ for which $X_t = 0$. Suppose further that there holds for all $s \in S$ with $\Pr[X_t = s] > 0$ … Then, for all …
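As a worked illustration of how a multiplicative drift argument is typically applied, consider the textbook form of the theorem, whose drift condition and conclusion are the first two displays below. The example that follows (randomized local search on OneMax, with $X_t$ the number of zero-bits) is a standard one from the literature and is not taken from this paper.

```latex
% Multiplicative drift, textbook form: drift condition and conclusion.
\[
  E[X_t - X_{t+1} \mid X_t = s] \ge \delta s
  \quad\Longrightarrow\quad
  E[T \mid X_0 = s_0] \le \frac{1 + \ln(s_0 / s_{\min})}{\delta}.
\]
% Illustration (assumption: randomized local search on OneMax, X_t = number of zeros).
% A one-bit flip hits one of the s zero-bits with probability s/n and is then accepted.
\[
  E[X_t - X_{t+1} \mid X_t = s] = \frac{s}{n}
  \quad\Longrightarrow\quad
  \delta = \frac{1}{n},\; s_{\min} = 1,\; s_0 \le n,
\]
\[
  E[T] \le \frac{1 + \ln(n)}{1/n} = n\,(1 + \ln n).
\]
```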

Theorems & Definitions (30)

Theorem 1: Multiplicative Drift Theorem [DoerrJW12algo]
Theorem 2: Additive Drift Theorem [HeY01]
Theorem 3: Simplified Version of Wald's Equation [Wald44, DoerrK15]
Theorem 4: Expected Duration of the Last Improvement
Lemma 5: [DrosteJW00]
Proof of Theorem \ref{thmformula}
Corollary 6
Proof
Theorem 7
Lemma 8: Drift in the slope towards the local optimum
...and 20 more
