Leveraging Axis-Aligned Subspaces for High-Dimensional Bayesian Optimization with Group Testing

Erik Hellsten, Carl Hvarfner, Leonard Papenmeier, Luigi Nardi

TL;DR

GTBO addresses the challenge of optimizing expensive, high-dimensional black-box functions by assuming an axis-aligned active subspace and identifying the active dimensions via a noisy adaptive group-testing stage extended to continuous GP objectives. It then concentrates the Bayesian optimization in the inferred subspace by tailoring GP priors and acquisition to emphasize the active variables. The main contributions are extending the group-testing framework to continuous domains, deriving an information-theoretic, MI-guided strategy for selecting variable groups, and demonstrating superior sample efficiency and interpretability on benchmarks where axis-aligned subspaces hold, with $d_e \ll D$. The approach offers a principled way to discover which features matter in high-dimensional optimization and yields practical gains in domains with sparse active dimensions.

Abstract

Bayesian optimization (BO) is an effective method for optimizing expensive-to-evaluate black-box functions. While high-dimensional problems can be particularly challenging, due to the multitude of parameter choices and the potentially high number of data points required to fit the model, this limitation can be addressed if the problem satisfies simplifying assumptions. Axis-aligned subspace approaches, where few dimensions have a significant impact on the objective, motivated several algorithms for high-dimensional BO. However, the validity of this assumption is rarely verified, and the assumption is rarely exploited to its full extent. We propose a group testing (GT) approach to identify active variables to facilitate efficient optimization in these domains. The proposed algorithm, Group Testing Bayesian Optimization (GTBO), first runs a testing phase where groups of variables are systematically selected and tested on whether they influence the objective, then terminates once active dimensions are identified. To that end, we extend the well-established GT theory to functions over continuous domains. In the second phase, GTBO guides optimization by placing more importance on the active dimensions. By leveraging the axis-aligned subspace assumption, GTBO outperforms state-of-the-art methods on benchmarks satisfying the assumption of axis-aligned subspaces, while offering improved interpretability.

Paper Structure

This paper contains 27 sections, 6 equations, 10 figures, 1 table, 1 algorithm.

Figures (10)

  • Figure 1: GTBO assumes an axis-aligned subspace. A point $x_1$ that only varies along inactive dimensions ($d_2$ and $d_4$) obtains a similar function value as the default point $(x_\textrm{def})$. Points $x_2$ and $x_3$ that vary along active dimensions ($d_1$ and $d_3$) have a higher likelihood under the signal distribution than under the noise distribution.
  • Figure 2: Evolution of the average marginal probability of being active across ten repetitions. Each line represents one dimension; active dimensions are colored green, and inactive dimensions are blue. In the few cases where GTBO finds inactive variables to be active, the lines are emphasized in red. The last iteration marks the end of the longest GT phase across all runs. All active dimensions are identified in all runs. 6 out of 1180 inactive dimensions are incorrectly classified as active once in ten runs across the benchmarks.
  • Figure 3: Sensitivity analysis for GTBO. The average percentage of correctly classified variables is displayed for increasing GT iterations. The percentage is ablated for (left) various levels of output noise, (middle) number of total dimensions, and (right) number of effective dimensions. Each legend shows the points of the respective parameter.
  • Figure 4: Mean logarithmic regret for four embedded synthetic benchmarks. The shaded regions indicate one standard error. GTBO finds active dimensions and subsequently optimizes efficiently.
  • Figure 5: GTBO outperforms competitors in real-world experiments. Notably, the performance on Mopta08 increases significantly after the GT phase at iteration 300, suggesting that the dimensions found during the GT phase are highly relevant.
  • ...and 5 more figures
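
The idea in the Figure 1 caption can be illustrated with a small sketch: perturb a group of dimensions away from the default point and ask whether the observed change in function value is better explained by a narrow "noise" distribution or a wider "signal" distribution. Everything below is an assumption made for illustration and is not taken from the paper: the toy objective `f`, the zero-mean Gaussian form of both distributions, the values of `noise_std` and `signal_std`, and the helper name `favours_signal`. The toy objective is noise-free so that the outcome is reproducible.

```python
import numpy as np


def gaussian_logpdf(z, std):
    """Log-density of a zero-mean Gaussian with standard deviation `std`."""
    return -0.5 * np.log(2.0 * np.pi * std**2) - 0.5 * (z / std) ** 2


def favours_signal(delta, noise_std=0.1, signal_std=1.0):
    """Return True if `delta` is more likely under the signal model.

    Assumed models (hypothetical, for illustration only):
      noise:  delta ~ N(0, noise_std^2)
      signal: delta ~ N(0, signal_std^2 + noise_std^2)
    """
    log_noise = gaussian_logpdf(delta, noise_std)
    log_signal = gaussian_logpdf(delta, np.hypot(signal_std, noise_std))
    return bool(log_signal > log_noise)


def f(x):
    """Toy objective with an axis-aligned subspace: only d_1 and d_3 matter."""
    return x[0] ** 2 + 3.0 * x[2]


# Default point and three test points in the spirit of Figure 1.
x_def = np.zeros(4)
x_1 = np.array([0.0, 0.8, 0.0, -0.5])  # varies only inactive d_2 and d_4
x_2 = np.array([0.9, 0.0, 0.0, 0.0])   # varies active d_1
x_3 = np.array([0.0, 0.0, 0.7, 0.0])   # varies active d_3

for name, x in [("x_1", x_1), ("x_2", x_2), ("x_3", x_3)]:
    delta = f(x) - f(x_def)
    print(name, "favours signal:", favours_signal(delta))
```

With these assumed standard deviations, a change is attributed to the signal model once its magnitude exceeds roughly 0.22, so `x_1` (no change) is attributed to noise, while `x_2` and `x_3` are attributed to signal. In a noisy setting the decision would be probabilistic rather than a hard threshold, which is why a single test is not treated as conclusive.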