Constrained multi-fidelity Bayesian optimization with automatic stop condition

Zahra Zanjani Foumani, Ramin Bostanabad

TL;DR

This work addresses the high evaluation cost of Bayesian optimization in constrained, multi-fidelity settings by proposing CMFBO, a Gaussian-process-based framework that fuses high-fidelity (HF) and low-fidelity (LF) data through mixed-input emulation and source-aware uncertainty. It introduces a constrained, cost-aware acquisition function that balances information gain against sampling cost and handles unknown constraints via source-specific GPs, supported by a novel automatic stopping criterion based on PAO stability. Compared with state-of-the-art MFBO baselines, the approach achieves lower overall sampling costs and robust performance across a suite of analytic benchmarks (3–20 dimensions, varying noise), and it is implemented in the open-source GP+ package. These contributions provide a practical, scalable method for efficient constrained optimization in domains where multiple data sources of differing fidelity and feasibility constraints are prevalent.

Abstract

Bayesian optimization (BO) is increasingly employed in critical applications to find the optimal design with minimal cost. While BO is known for its sample efficiency, relying solely on costly high-fidelity data can still result in high costs. This is especially the case in constrained search spaces where BO must not only optimize but also ensure feasibility. A related issue in the BO literature is the lack of a systematic stopping criterion. To address these challenges, we develop a constrained cost-aware multi-fidelity BO (CMFBO) framework whose goal is to minimize overall sampling costs by utilizing inexpensive low-fidelity sources while ensuring feasibility. In our case, the constraints can change across the data sources and may even be black-box functions. We also introduce a systematic stopping criterion that addresses the long-standing issue of assessing BO's convergence. Our framework is publicly available on GitHub through the GP+ Python package, and we validate its efficacy on multiple benchmark problems.

Paper Structure

This paper contains 17 sections, 23 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: MF Modeling with GPs: The training data is built by first augmenting the inputs with the categorical feature $s$ and then concatenating all the inputs and outputs. For MF emulation, we recommend using two manifolds to simplify the visualization of cross-source relationships: one manifold for $s$ and the other for the rest of the categorical variables, i.e., $\boldsymbol{t}$.
  • Figure 2: Mixed vs Single Basis Functions: Two generic options are defined in GP+ for building the mean function: $(1)$ mixed basis where multiple bases can be defined for each data source, $(2)$ single basis where a global shared function is learned for all the data sources.
  • Figure 3: Convergence values and costs (small noise): The insets provide a magnified view of the variations.
  • Figure 4: Convergence values and costs (large noise): The insets provide a magnified view of the variations.
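
The data-fusion step described in the Figure 1 caption (augment each source's inputs with the categorical feature $s$, then concatenate all inputs and outputs) can be sketched in a few lines of NumPy. This is only an illustrative sketch and not the GP+ API: the function name `build_mf_training_data`, the convention that index 0 denotes the high-fidelity source, and the toy data are all assumptions made for this example.

```python
import numpy as np


def build_mf_training_data(sources):
    """Stack per-source (X, y) pairs into one dataset with a source column.

    `sources` is a list of (X, y) tuples. The position of each tuple in the
    list is used as its source index s (hypothetical convention: 0 = HF).
    """
    X_parts, y_parts = [], []
    for s, (X, y) in enumerate(sources):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float).reshape(-1)
        if X.shape[0] != y.shape[0]:
            raise ValueError(f"source {s}: X and y have different sample counts")
        # Augment the inputs with the categorical source indicator s
        # as an extra (last) column.
        s_col = np.full((X.shape[0], 1), s, dtype=float)
        X_parts.append(np.hstack([X, s_col]))
        y_parts.append(y)
    # Concatenate all sources into a single training set.
    return np.vstack(X_parts), np.concatenate(y_parts)


# Toy data (made up for illustration): 5 HF and 20 LF samples in 3 dimensions.
rng = np.random.default_rng(0)
X_hf = rng.uniform(size=(5, 3))
X_lf = rng.uniform(size=(20, 3))
y_hf = np.sin(X_hf.sum(axis=1))
y_lf = np.sin(X_lf.sum(axis=1)) + 0.1  # biased low-fidelity response

X_all, y_all = build_mf_training_data([(X_hf, y_hf), (X_lf, y_lf)])
```

The combined array `X_all` has one more column than the original inputs, and a single emulator trained on `(X_all, y_all)` can then treat the last column as a categorical variable to learn relationships across sources.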