Applying Multi-Fidelity Bayesian Optimization in Chemistry: Open Challenges and Major Considerations

Edmund Judge, Mohammed Azzouzi, Austin M. Mroz, Antonio del Rio Chanona, Kim E. Jelfs

TL;DR

This work investigates the application of multi-fidelity Bayesian optimization (MFBO) to accelerate the identification of promising molecules or materials and addresses two key challenges: selecting the optimal acquisition function, and understanding the impact of cost and data fidelity correlation.

Abstract

Multi-fidelity Bayesian optimization (MFBO) leverages experimental and/or computational data of varying quality and resource cost to optimize towards desired maxima cost-effectively. This approach is particularly attractive for chemical discovery due to MFBO's ability to integrate diverse data sources. Here, we investigate the application of MFBO to accelerate the identification of promising molecules or materials. We specifically analyze the conditions under which lower-fidelity data can enhance performance compared to single-fidelity problem formulations. We address two key challenges: selecting the optimal acquisition function, and understanding the impact of cost and data fidelity correlation. We then discuss how to assess the effectiveness of MFBO for chemical discovery.

Paper Structure

This paper contains 12 sections, 3 equations, 10 figures.

Figures (10)

  • Figure 1: The behaviour of the MF-MES (top) and SF-EI (bottom) search algorithms for a single run optimizing (far-left) Problem 1 (RKHS), (middle-left) Problem 2 (6D negated Hartmann), (middle-right) Problem 3 (COF selectivities) and (far-right) Problem 4 (organic photovoltaic molecules). The dashed lines denote the domain optimum and the obtained optimum.
  • Figure 2: A heatmap for the MF-MES search algorithm showing how the correlation and cost of the low-fidelity data influence the optimization rate for (left) Problem 1 (RKHS) and (right) Problem 2 (6D negated Hartmann).
  • Figure 3: Comparison of the acquisition functions for (left) Problem 1 (RKHS), (middle) Problem 2 (6D negated Hartmann) and (right) Problem 3 (organic photovoltaic molecules).
  • Figure 4: High- and low-fidelity data for the RKHS function and their distributions (Problem 1).
  • Figure 5: Histogram of 6D Hartmann evaluations (Problem 2).
  • ...and 5 more figures
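
The fidelity correlation that the abstract and the Figure 2 caption refer to can be made concrete on the 6D negated Hartmann function. The following is a minimal sketch, not the authors' setup: the Hartmann constants are the standard published ones, while the `fidelity` knob, the way it perturbs the first weight, and the helper name `fidelity_correlation` are assumptions introduced here for illustration only.

```python
import numpy as np

# Standard Hartmann-6 constants (textbook definition of the test function).
ALPHA = np.array([1.0, 1.2, 3.0, 3.2])
A = np.array([
    [10.0, 3.0, 17.0, 3.5, 1.7, 8.0],
    [0.05, 10.0, 17.0, 0.1, 8.0, 14.0],
    [3.0, 3.5, 1.7, 10.0, 17.0, 8.0],
    [17.0, 8.0, 0.05, 10.0, 0.1, 14.0],
])
P = 1e-4 * np.array([
    [1312, 1696, 5569, 124, 8283, 5886],
    [2329, 4135, 8307, 3736, 1004, 9991],
    [2348, 1451, 3522, 2883, 3047, 6650],
    [4047, 8828, 8732, 5743, 1091, 381],
])


def negated_hartmann6(x, fidelity=1.0):
    """Negated 6D Hartmann on [0, 1]^6, to be maximized.

    `fidelity` in [0, 1] is a hypothetical knob (an assumption of this
    sketch): 1.0 gives the standard function, lower values shrink the
    first weight to mimic cheaper, biased data.
    """
    x = np.atleast_2d(x)
    alpha = ALPHA.copy()
    alpha[0] -= 0.1 * (1.0 - fidelity)  # assumed low-fidelity bias
    # A-weighted squared distance from each point to each of the 4 centers.
    diff_sq = (x[:, None, :] - P[None, :, :]) ** 2
    inner = np.einsum("kj,nkj->nk", A, diff_sq)
    return np.exp(-inner) @ alpha


def fidelity_correlation(low_fidelity, n_samples=2000, seed=0):
    """Pearson correlation between low- and high-fidelity values
    evaluated at the same uniformly sampled points."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_samples, 6))
    high = negated_hartmann6(x, fidelity=1.0)
    low = negated_hartmann6(x, fidelity=low_fidelity)
    return float(np.corrcoef(low, high)[0, 1])


if __name__ == "__main__":
    x_star = [0.20169, 0.150011, 0.476874, 0.275332, 0.311652, 0.6573]
    print("high-fidelity value at the known optimum:", negated_hartmann6(x_star)[0])
    print("correlation at fidelity 0.0:", fidelity_correlation(0.0))
```

At full fidelity the function attains its known maximum of about 3.322 at `x_star`. With this particular perturbation the two fidelities stay strongly correlated; a study of weakly correlated low-fidelity data would need a different, more aggressive low-fidelity model than the one assumed here.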