Estimation-of-Distribution Algorithms for Multi-Valued Decision Variables

Firas Ben Jedidia; Benjamin Doerr; Martin S. Krejca

TL;DR

This work extends univariate EDAs to multi-valued decision variables by introducing a frequency-matrix framework with per-row sampling and border-based restrictions. Through a McDiarmid-based concentration analysis, it shows that genetic drift becomes significant about $r$ times sooner when the variables take $r$ values, so model updates must be slowed by a factor of about $r$ to remain reliable. The authors analyze the runtime of the $r$-UMDA on the $r$-LeadingOnes benchmark, proving an upper bound of $O(r\ln(r)^2 n^2 \ln(n))$ function evaluations and a lower bound of $\Omega(r\ln(r) n^2 \ln(n))$, which match up to a factor of $\ln(r)$. These results show that the understanding of binary EDAs extends naturally to multi-valued problems and give concrete guidance for parameter settings, potentially broadening EDA applications beyond permutation and binary domains.

Abstract

The majority of research on estimation-of-distribution algorithms (EDAs) concentrates on pseudo-Boolean optimization and permutation problems, leaving the domain of EDAs for problems in which the decision variables can take more than two values, but which are not permutation problems, mostly unexplored. To render this domain more accessible, we propose a natural way to extend the known univariate EDAs to this setting. Different from a naive reduction to the binary case, our approach avoids additional constraints. Since understanding genetic drift is crucial for an optimal parameter choice, we extend the known quantitative analysis of genetic drift to EDAs for multi-valued variables. Roughly speaking, when the variables take $r$ different values, the time for genetic drift to become significant is $r$ times shorter than in the binary case. Consequently, the update strength of the probabilistic model has to be chosen $r$ times lower now. To investigate how desired model updates take place in this framework, we undertake a mathematical runtime analysis on the $r$-valued LeadingOnes problem. We prove that with the right parameters, the multi-valued UMDA solves this problem efficiently in $O(r\ln(r)^2 n^2 \ln(n))$ function evaluations. This bound is nearly tight as our lower bound $\Omega(r\ln(r) n^2 \ln(n))$ shows. Overall, our work shows that our good understanding of binary EDAs naturally extends to the multi-valued setting, and it gives advice on how to set the main parameters of multi-valued EDAs.
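To make the framework concrete, the following is a minimal Python sketch of a multi-valued UMDA without margins on an $r$-valued LeadingOnes function. It is an illustration under stated assumptions, not the authors' exact algorithm: the names `r_leading_ones` and `r_umda`, the parameters `lam` (population size) and `mu` (selection size), the choice of 0 as the target value, and the plain relative-frequency update are all assumptions made for this sketch. The border-based restrictions mentioned above are deliberately omitted, so frequencies can reach 0 and the run may stall.

```python
import random


def r_leading_ones(x):
    """Number of leading positions holding the value 0 (assumed target value)."""
    count = 0
    for v in x:
        if v != 0:
            break
        count += 1
    return count


def r_umda(n, r, lam, mu, max_iters, rng):
    """Sketch of an r-valued UMDA without margins (hypothetical helper).

    Returns (evaluations, p): evaluations is the number of fitness
    evaluations used until an optimum was sampled, or None if none was
    found within max_iters iterations; p is the final frequency matrix.
    """
    # Frequency matrix: row i is a probability distribution over 0..r-1.
    p = [[1.0 / r] * r for _ in range(n)]
    values = list(range(r))
    evaluations = 0
    for _ in range(max_iters):
        # Sample each position independently from its own row.
        pop = [
            [rng.choices(values, weights=p[i])[0] for i in range(n)]
            for _ in range(lam)
        ]
        evaluations += lam
        pop.sort(key=r_leading_ones, reverse=True)
        if r_leading_ones(pop[0]) == n:
            return evaluations, p
        # Model update: relative frequencies among the mu best samples.
        selected = pop[:mu]
        for i in range(n):
            counts = [0] * r
            for x in selected:
                counts[x[i]] += 1
            p[i] = [c / mu for c in counts]
    return None, p


if __name__ == "__main__":
    evals, model = r_umda(n=6, r=3, lam=200, mu=50, max_iters=200,
                          rng=random.Random(1))
    print("evaluations until optimum:", evals)
```

Each row of the frequency matrix stays a probability distribution after every update, which is the invariant that the per-row sampling relies on.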

Paper Structure (18 sections, 13 theorems, 24 equations, 4 algorithms)

Introduction
Related Work
Preliminaries
Multi-Valued EDAs
Binary EDAs
The Multi-Valued EDA Framework
Genetic Drift
Introduction to Genetic Drift
Martingale Property of Neutral Positions
Upper Bound on the Genetic-Drift Effect of a Neutral Position
Upper Bound for Positions with Weak Preference
Runtime Analysis of the r-UMDA
Previous Runtime Analyses of EDAs on LeadingOnes
The r-LeadingOnes Benchmark
Runtime Results
...and 3 more sections

Key Result

Lemma 1

Let $f$ be an $r$-valued fitness function, and let $i \in [n]$ be a neutral position of $f$. Consider the $r$-UMDA without margins optimizing $f$. For each $j \in [0..r-1]$, the frequencies $(p_{i,j}^{(t)})_{t \in \mathbb{N}}$ are a martingale.
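The martingale property can be checked empirically with a short Python sketch. It rests on one simplifying assumption that is not spelled out in the lemma text here: because the fitness does not depend on a neutral position, selection is taken to be independent of the values at that position, so the values of the selected individuals at position $i$ behave like independent draws from the current row of frequencies. The helper names `neutral_update` and `mean_next_frequencies` and the parameter `mu` (selection size) are hypothetical.

```python
import random


def neutral_update(p_row, mu, rng):
    """One model update at a neutral position, without margins.

    Assumption: the mu selected individuals carry values at this position
    that are i.i.d. draws from the current frequencies p_row.
    """
    values = range(len(p_row))
    sample = rng.choices(values, weights=p_row, k=mu)
    return [sample.count(j) / mu for j in values]


def mean_next_frequencies(p_row, mu, trials, rng):
    """Average the updated frequencies over many independent updates."""
    totals = [0.0] * len(p_row)
    for _ in range(trials):
        nxt = neutral_update(p_row, mu, rng)
        for j, freq in enumerate(nxt):
            totals[j] += freq
    return [t / trials for t in totals]


if __name__ == "__main__":
    p_row = [0.5, 0.3, 0.2]
    means = mean_next_frequencies(p_row, mu=20, trials=20000,
                                  rng=random.Random(0))
    # The averages should be close to p_row: the update has no drift in
    # expectation, even though single updates fluctuate noticeably.
    print(means)
```

The expected next frequency equals the current one, yet every individual update is random. These fluctuations accumulate over time, which is the genetic-drift effect that the concentration analysis quantifies.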

Theorems & Definitions (24)

Lemma 1
proof
Theorem 2: Martingale concentration result based on the variance [McDiarmid98]
Theorem 3
proof
Theorem 4
proof
Theorem 5
proof
Theorem 6
...and 14 more
