CatCMA: Stochastic Optimization for Mixed-Category Problems
Ryoki Hamano, Shota Saito, Masahiro Nomura, Kento Uchida, Shinichi Shirakawa
TL;DR
CatCMA tackles mixed-category black-box optimization (MC-BBO) by learning a joint search distribution over continuous and categorical variables, using information geometric optimization to update a coupled Gaussian–categorical model. It integrates CMA-ES-style rank-one updates, step-size adaptation, and ASNG-inspired learning-rate adaptation to balance the two variable types, along with an analytically derived margin correction that prevents premature convergence. Backed by a theoretical analysis of the margin, extensive experiments show that CatCMA is more robust and performs better than state-of-the-art Bayesian optimization methods for MC-BBO such as CASMOPOLITAN and TPE, especially in higher dimensions. The work provides a practical, scalable framework for jointly optimizing continuous and categorical decisions in complex black-box settings.
Abstract
Black-box optimization problems often require simultaneously optimizing different types of variables, such as continuous, integer, and categorical variables. Unlike integer variables, categorical variables do not necessarily have a meaningful order, so the usual approach of discretizing continuous variables does not work well for them. Although several Bayesian optimization methods can deal with mixed-category black-box optimization (MC-BBO), they scale poorly to high-dimensional problems and incur a high internal computational cost. This paper proposes CatCMA, a stochastic optimization method for MC-BBO problems, which employs the joint probability distribution of multivariate Gaussian and categorical distributions as the search distribution. CatCMA updates the parameters of the joint probability distribution in the natural gradient direction. CatCMA also incorporates the acceleration techniques used in the covariance matrix adaptation evolution strategy (CMA-ES) and the stochastic natural gradient method, such as step-size adaptation and learning rate adaptation. In addition, we restrict the ranges of the categorical distribution parameters by a margin to prevent premature convergence and analytically derive a promising margin setting. Numerical experiments show that CatCMA performs better and is more robust to the problem dimension than state-of-the-art Bayesian optimization algorithms.
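To make the ideas in the abstract more concrete, the following is a minimal Python sketch of a joint Gaussian and categorical search distribution with a margin on the categorical parameters. It is an illustration under simplifying assumptions, not the authors' implementation: the Gaussian is isotropic rather than using a full covariance matrix, the names `sample_joint`, `apply_margin`, and `update_categorical` are hypothetical, and the weights, learning rate, and margin value are placeholders. The margin rule shown (clamp each probability to a lower bound, then renormalize) is one simple way to keep probabilities away from zero and may differ from the setting derived in the paper.

```python
import random


def sample_joint(mean, sigma, q, rng):
    """Draw one candidate from the joint search distribution.

    Continuous part: N(mean, sigma^2 I) (isotropic, a simplifying assumption).
    Categorical part: one index per categorical variable, drawn from q[n].
    """
    x = [rng.gauss(m, sigma) for m in mean]
    c = [rng.choices(range(len(q_n)), weights=q_n)[0] for q_n in q]
    return x, c


def apply_margin(q_n, q_min):
    """Keep every probability >= q_min while the vector still sums to 1.

    Assumes q_min <= 1 / len(q_n); otherwise no valid vector exists.
    """
    clipped = [max(p, q_min) for p in q_n]
    excess = [p - q_min for p in clipped]
    total_excess = sum(excess)
    if total_excess == 0.0:
        # Every entry sits exactly at the margin: fall back to uniform.
        return [1.0 / len(q_n)] * len(q_n)
    # Remove the mass added by clipping, in proportion to each entry's excess.
    scale = (1.0 - sum(clipped)) / total_excess
    return [p + scale * e for p, e in zip(clipped, excess)]


def update_categorical(q, ranked_cats, weights, lr, q_min):
    """Natural-gradient step q += lr * sum_i w_i * (onehot(c_i) - q), then margin.

    ranked_cats holds the categorical parts of the samples, best first,
    and weights[i] is the (hypothetical) ranking weight of sample i.
    """
    new_q = []
    for n, q_n in enumerate(q):
        grad = [0.0] * len(q_n)
        for w, c in zip(weights, ranked_cats):
            for k in range(len(q_n)):
                onehot = 1.0 if c[n] == k else 0.0
                grad[k] += w * (onehot - q_n[k])
        stepped = [p + lr * g for p, g in zip(q_n, grad)]
        new_q.append(apply_margin(stepped, q_min))
    return new_q
```

In a full loop, one would sample a population with `sample_joint`, rank the candidates by objective value, update the Gaussian parameters with CMA-ES-style rules, and pass the ranked categorical parts to `update_categorical`. The margin step is what stops any category probability from collapsing to zero, which is the premature convergence the abstract refers to.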
