High Dimensional Bayesian Optimization using Lasso Variable Selection
Vu Viet Hoang, Hung The Tran, Sunil Gupta, Vu Nguyen
TL;DR
This work tackles the challenge of scaling Bayesian optimization to high-dimensional problems by introducing LassoBO, which uses an $\ell_1$-regularized marginal likelihood to estimate inverse length scales $\rho_i$ and identify important variables. It then builds a variable-importance subspace and imputes unimportant variables to form multiple subspaces, enabling acquisition optimization to focus on the informative dimensions while maintaining exploration. The authors provide a sublinear cumulative regret bound under kernel-based smoothness assumptions and demonstrate state-of-the-art performance on synthetic benchmarks and real-world tasks like Rover, MuJoCo, and DNA, highlighting improved efficiency and scalability. Overall, LassoBO offers a theoretically grounded, practical approach to high-dimensional BO by adaptively learning active subspaces and leveraging sparsity in kernel length scales to guide search.
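The $\ell_1$-regularized marginal likelihood mentioned above can be written out as a rough sketch. The kernel form, the penalty weight $\lambda$, the noise variance $\sigma^2$, the dimension $D$, and the number of observations $n$ are notation assumed here for illustration and may differ from the paper's exact formulation:

```latex
% Assumed ARD squared-exponential kernel parameterized by inverse length scales \rho_i
k_{\rho}(x, x') = \exp\!\Big(-\tfrac{1}{2}\sum_{i=1}^{D} \rho_i^{2}\,(x_i - x'_i)^{2}\Big)

% Gaussian process log marginal likelihood for n observations (X, \mathbf{y})
\log p(\mathbf{y} \mid X, \rho)
  = -\tfrac{1}{2}\,\mathbf{y}^{\top}\big(K_{\rho} + \sigma^{2} I\big)^{-1}\mathbf{y}
    -\tfrac{1}{2}\,\log\det\big(K_{\rho} + \sigma^{2} I\big)
    -\tfrac{n}{2}\,\log 2\pi

% Lasso-style estimate: the \ell_1 penalty pushes many \rho_i toward zero
\hat{\rho} = \arg\max_{\rho \ge 0}\;
  \log p(\mathbf{y} \mid X, \rho) - \lambda \sum_{i=1}^{D} |\rho_i|
```

Under this parameterization, $\rho_i = 0$ removes variable $i$ from the kernel entirely, so variables with a non-negligible $\hat{\rho}_i$ would be the ones treated as important.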
Abstract
Bayesian optimization (BO) is a leading method for optimizing expensive black-box functions and has been successfully applied across various scenarios. However, BO suffers from the curse of dimensionality, making it challenging to scale to high-dimensional problems. Existing work has adopted a variable selection strategy to select and optimize only a subset of variables iteratively. Although this approach can mitigate the high-dimensional challenge in BO, it still leads to sample inefficiency. To address this issue, we introduce a novel method that identifies important variables by estimating the length scales of Gaussian process kernels. Next, we construct an effective search region consisting of multiple subspaces and optimize the acquisition function within this region, focusing on only the important variables. We demonstrate that our proposed method achieves cumulative regret with a sublinear growth rate in the worst case while maintaining computational efficiency. Experiments on high-dimensional synthetic functions and real-world problems show that our method achieves state-of-the-art performance.
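To make the idea of a search region built from multiple subspaces concrete, here is a minimal Python sketch. It assumes the inverse length scales have already been estimated, and it is not the authors' implementation: the function names, the threshold, and the imputation rule (best point so far for the first subspace, uniform random values for the rest) are all hypothetical choices made for illustration.

```python
import random

def select_important(rho, threshold=1e-2):
    """Indices whose estimated inverse length scale exceeds a (hypothetical) threshold."""
    return [i for i, r in enumerate(rho) if abs(r) > threshold]

def build_subspaces(rho, best_x, bounds, n_subspaces=3, threshold=1e-2, seed=0):
    """Return the important indices and one dict of imputed values per subspace.

    Assumption: the first subspace imputes unimportant variables from the best
    point found so far; the others impute them uniformly at random in bounds.
    """
    rng = random.Random(seed)
    important = select_important(rho, threshold)
    important_set = set(important)
    subspaces = []
    for k in range(n_subspaces):
        if k == 0:
            template = list(best_x)
        else:
            template = [rng.uniform(lo, hi) for lo, hi in bounds]
        # Keep only the unimportant coordinates; these stay fixed in this subspace.
        fixed = {i: template[i] for i in range(len(rho)) if i not in important_set}
        subspaces.append(fixed)
    return important, subspaces

def embed(z, important, fixed, dim):
    """Combine free values z (important variables) with imputed values into a full point."""
    x = [0.0] * dim
    for i, v in fixed.items():
        x[i] = v
    for i, v in zip(important, z):
        x[i] = v
    return x

# Hypothetical example: 5 variables, of which only variables 0 and 2 matter.
rho = [1.3, 0.0, 0.7, 0.0, 0.0]
best_x = [0.2, 0.5, 0.9, 0.1, 0.4]
bounds = [(0.0, 1.0)] * 5

important, subspaces = build_subspaces(rho, best_x, bounds)
x = embed([0.25, 0.75], important, subspaces[0], dim=5)
print(important)  # [0, 2]
print(x)          # [0.25, 0.5, 0.75, 0.1, 0.4]
```

In a full BO loop, an acquisition function would be optimized over `z` within each subspace and the best candidate across subspaces would be evaluated next; that step is omitted here because it depends on the surrogate model.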
