Estimation-of-Distribution Algorithms for Multi-Valued Decision Variables
Firas Ben Jedidia, Benjamin Doerr, Martin S. Krejca
TL;DR
This work extends univariate EDAs to multi-valued decision variables by introducing a frequency-matrix framework with per-row sampling and border-based restrictions. Through a McDiarmid-based concentration analysis, it shows that genetic drift becomes significant about $r$ times faster when the variables take $r$ values, so the update strength of the probabilistic model has to be chosen about $r$ times lower to keep the model reliable. The authors analyze the runtime of the $r$-UMDA on the $r$-valued LeadingOnes benchmark, proving an upper bound of $O(r\ln(r)^2 n^2 \ln(n))$ function evaluations and a lower bound of $\Omega(r\ln(r) n^2 \ln(n))$, so the two bounds differ only by a factor of $\ln(r)$. These results demonstrate that the rich understanding of binary EDAs naturally extends to multi-valued problems, and they offer concrete guidance for parameter settings in practice, potentially broadening EDA applications beyond permutation and binary domains.
Abstract
The majority of research on estimation-of-distribution algorithms (EDAs) concentrates on pseudo-Boolean optimization and permutation problems, leaving the domain of EDAs for problems in which the decision variables can take more than two values, but which are not permutation problems, mostly unexplored. To render this domain more accessible, we propose a natural way to extend the known univariate EDAs to this setting. Unlike a naive reduction to the binary case, our approach avoids additional constraints. Since understanding genetic drift is crucial for an optimal parameter choice, we extend the known quantitative analysis of genetic drift to EDAs for multi-valued variables. Roughly speaking, when the variables take $r$ different values, the time for genetic drift to become significant is $r$ times shorter than in the binary case. Consequently, the update strength of the probabilistic model has to be chosen $r$ times lower. To investigate how desired model updates take place in this framework, we undertake a mathematical runtime analysis on the $r$-valued LeadingOnes problem. We prove that with the right parameters, the multi-valued UMDA solves this problem efficiently in $O(r\ln(r)^2 n^2 \ln(n))$ function evaluations. This bound is nearly tight, as our lower bound of $\Omega(r\ln(r) n^2 \ln(n))$ shows. Overall, our work shows that our good understanding of binary EDAs naturally extends to the multi-valued setting, and it gives advice on how to set the main parameters of multi-valued EDAs.
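To make the frequency-matrix idea concrete, the following Python sketch shows one way a multi-valued UMDA on an $r$-valued LeadingOnes function could look. It is an illustration under stated assumptions, not the algorithm as specified in the paper. The function names, the choice of 0 as the target value, the border values `1/((r-1)n)` and `1 - 1/n`, and the clamp-then-renormalize restriction step are all assumptions made for this example. In particular, renormalizing after clamping can move a frequency slightly outside the borders, which a rigorous restriction scheme would avoid.

```python
import random


def leading_prefix(x, target=0):
    """Length of the longest prefix of x whose entries all equal `target`.

    Assumption: the r-valued LeadingOnes fitness counts a prefix of one
    fixed target value (0 here); the target is hypothetical.
    """
    count = 0
    for v in x:
        if v != target:
            break
        count += 1
    return count


def sample(freq, rng):
    """Draw one individual: position i takes value j with probability freq[i][j]."""
    r = len(freq[0])
    return [rng.choices(range(r), weights=row)[0] for row in freq]


def restrict(row, lower, upper):
    """Clamp each frequency to [lower, upper], then renormalize the row.

    Simplified placeholder for a border restriction (an assumption).
    """
    clamped = [min(max(p, lower), upper) for p in row]
    total = sum(clamped)
    return [p / total for p in clamped]


def r_umda(n, r, lam, mu, max_gens, seed=0):
    """Sketch of a multi-valued UMDA; requires n >= 2 and r >= 2.

    Returns (number of sampled individuals, frequency matrix); the first
    entry is None if the optimum was not sampled within max_gens generations.
    """
    rng = random.Random(seed)
    # One row per position, one column per value; each row sums to 1.
    freq = [[1.0 / r] * r for _ in range(n)]
    # Assumed borders; they keep every value samplable at every position.
    lower, upper = 1.0 / ((r - 1) * n), 1.0 - 1.0 / n
    for gen in range(1, max_gens + 1):
        pop = [sample(freq, rng) for _ in range(lam)]
        pop.sort(key=leading_prefix, reverse=True)
        if leading_prefix(pop[0]) == n:
            return gen * lam, freq
        selected = pop[:mu]
        # Set each row to the relative value frequencies among the mu best.
        for i in range(n):
            counts = [0] * r
            for x in selected:
                counts[x[i]] += 1
            freq[i] = restrict([c / mu for c in counts], lower, upper)
    return None, freq
```

A call such as `r_umda(n=10, r=4, lam=200, mu=50, max_gens=500)` runs the sketch on a small instance. In line with the genetic drift result, a larger `mu` weakens each model update, and the analysis suggests that this kind of slowdown has to grow with `r`.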
