Tail Bounds on the Runtime of Categorical Compact Genetic Algorithm
Ryoki Hamano, Kento Uchida, Shinichi Shirakawa, Daiki Morinaga, Youhei Akimoto
TL;DR
This paper develops a theory for the runtime of a categorical compact genetic algorithm (ccGA) with sample size two, extending the binary cGA to the categorical domain. It establishes tight tail bounds for runtimes on two representative linear objectives, COM and KVal, under explicit learning-rate regimes that depend on the problem size $D$ and the number of categories $K$, and introduces two novel drift-analytic tools: conditional drift theorems and a drift theorem for skipping processes. The main results show that the runtime on COM is, with high probability, $O(\frac{\sqrt{D}\ln(DK)}{η})$ and lower-bounded by $Ω(\frac{\sqrt{D}+\ln K}{η})$, while the runtime on KVal is $Θ(\frac{D \ln K}{η})$ under suitable $η$, highlighting how the category count $K$ influences search efficiency differently across linear functions. The paper also provides general drift-theorem extensions, connects to the information-geometric optimization framework, and supports the theory with experiments, offering guidance on choosing learning rates for categorical optimization problems. The results advance understanding of EDAs on categorical domains and inform practical parameter settings for scalable discrete optimization tasks.
Abstract
The majority of theoretical analyses of evolutionary algorithms in the discrete domain focus on binary optimization algorithms, even though black-box optimization on the categorical domain has many practical applications. In this paper, we consider a probabilistic model-based algorithm that uses the family of categorical distributions as its underlying distribution and set the sample size to two. We term this specific algorithm the categorical compact genetic algorithm (ccGA). The ccGA can be considered an extension of the compact genetic algorithm (cGA), which is an efficient binary optimization algorithm. We theoretically analyze the dependency of the runtime on the number of possible categories $K$, the number of dimensions $D$, and the learning rate $η$. We investigate the tail bound of the runtime on two typical linear functions on the categorical domain: categorical OneMax (COM) and KVal. We derive that the runtimes on COM and KVal are $O(\sqrt{D} \ln (DK) / η)$ and $Θ(D \ln K/ η)$ with high probability, respectively. Our analysis generalizes that of the cGA on the binary domain.
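
To make the setting concrete, the following is a minimal Python sketch of a cGA-style update on categorical distributions with sample size two. It is an illustration under stated assumptions, not the paper's exact algorithm. The abstract does not specify the update rule, the definition of COM, the tie-breaking, or the margin handling, so these are all assumptions here. COM is taken to count the positions holding a designated category (index 0), the update shifts probability mass $η$ from the loser's category to the winner's category in each dimension, and the lower margin `1 / (D * K)` and the names `com`, `ccga_sketch`, and `target` are hypothetical.

```python
import numpy as np


def com(x):
    # Assumed form of categorical OneMax: number of positions holding category 0.
    return int(np.sum(x == 0))


def ccga_sketch(f, target, D, K, eta, max_iters=100_000, seed=0):
    """cGA-style search over D categorical variables with K categories each.

    Returns (t, theta): the first iteration at which a sample reaches `target`
    (or None if that never happens) and the final distribution parameters.
    """
    rng = np.random.default_rng(seed)
    theta = np.full((D, K), 1.0 / K)   # one categorical distribution per dimension
    margin = 1.0 / (D * K)             # illustrative lower margin (assumption)
    rows = np.arange(D)

    for t in range(1, max_iters + 1):
        # Draw two candidate solutions from the current distributions.
        a, b = (
            np.array([rng.choice(K, p=theta[d]) for d in range(D)])
            for _ in range(2)
        )
        fa, fb = f(a), f(b)
        if max(fa, fb) >= target:
            return t, theta

        # Ties are broken in favour of the first sample (assumption).
        winner, loser = (a, b) if fa >= fb else (b, a)

        # Move mass eta towards the winner's category and away from the loser's.
        # Dimensions where both samples agree receive +eta and -eta, i.e. no change.
        theta[rows, winner] += eta
        theta[rows, loser] -= eta

        # Keep every category reachable, then restore row sums to one.
        theta = np.clip(theta, margin, 1.0)
        theta /= theta.sum(axis=1, keepdims=True)

    return None, theta


if __name__ == "__main__":
    D, K, eta = 10, 4, 0.02
    t, theta = ccga_sketch(com, target=D, D=D, K=K, eta=eta)
    print("first hitting iteration:", t)
```

Repeating such runs over several seeds and values of $D$, $K$, and $η$ gives an empirical runtime distribution that can be compared informally against the stated bounds, keeping in mind that the simplified margin and tie-breaking may differ from the analyzed algorithm.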
