Diagnosing and fixing common problems in Bayesian optimization for molecule design

Austin Tripp; José Miguel Hernández-Lobato

TL;DR

This work examines why Bayesian optimization (BO) can underperform in molecule design and attributes much of the gap to three setup choices: the prior width, over-smoothing, and inadequate maximization of the acquisition function. After diagnosing these issues and applying targeted fixes to a basic GP-BO setup with Morgan fingerprints, the authors report state-of-the-art performance on the PMO benchmark (AUC Top-10 = 16.303). The result indicates that a carefully tuned, principled BO setup can outperform strong baselines, suggesting that BO merits greater attention in machine learning for molecules. The study also discusses limitations and motivates future work on richer surrogate models, multi-task and noisy settings, and broader experimentation with acquisition functions.

Abstract

Bayesian optimization (BO) is a principled approach to molecular design tasks. In this paper we explain three pitfalls of BO which can cause poor empirical performance: an incorrect prior width, over-smoothing, and inadequate acquisition function maximization. We show that with these issues addressed, even a basic BO setup is able to achieve the highest overall performance on the PMO benchmark for molecule design (Gao et al., 2022). These results suggest that BO may benefit from more attention in the machine learning for molecules community.
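
As a rough illustration of the first two pitfalls, the sketch below fits a tiny 1D Gaussian process and reports how the posterior and the expected improvement (EI) at an unexplored input change with the prior width $\sigma$ and the lengthscale $\ell$. It is a minimal sketch under stated assumptions, not the paper's setup: the RBF kernel, the zero prior mean, the three made-up observations, the choice of EI, and the helper names `rbf_kernel`, `solve`, and `posterior_and_ei` are all hypothetical.

```python
import math

def rbf_kernel(a, b, sigma, ell):
    """RBF kernel (an assumption) with prior width sigma and lengthscale ell."""
    return sigma ** 2 * math.exp(-0.5 * ((a - b) / ell) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def posterior_and_ei(x_new, xs, ys, sigma, ell, noise=1e-6):
    """Zero-mean GP posterior (mean, std) at x_new and EI over max(ys)."""
    gram = [[rbf_kernel(a, b, sigma, ell) + (noise if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_new = [rbf_kernel(x_new, a, sigma, ell) for a in xs]
    mean = sum(k * w for k, w in zip(k_new, solve(gram, ys)))
    var = sigma ** 2 - sum(k * w for k, w in zip(k_new, solve(gram, k_new)))
    std = math.sqrt(max(var, 1e-12))  # guard against tiny negative round-off
    best = max(ys)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mean, std, (mean - best) * cdf + std * pdf

xs, ys = [0.0, 1.0, 2.0], [0.2, 0.5, 0.3]  # made-up observations

# Prior width: far from the data the posterior std reverts to sigma,
# so a small sigma makes an unexplored input look unrewarding.
for sigma in (0.1, 1.0):
    mean, std, ei = posterior_and_ei(6.0, xs, ys, sigma=sigma, ell=1.0)
    print(f"sigma={sigma}: mean={mean:.3f} std={std:.3f} EI={ei:.2e}")

# Lengthscale: a large ell keeps a nearby unexplored input strongly
# correlated with the data, which shrinks its posterior std.
for ell in (0.5, 5.0):
    mean, std, ei = posterior_and_ei(3.5, xs, ys, sigma=1.0, ell=ell)
    print(f"ell={ell}: mean={mean:.3f} std={std:.3f} EI={ei:.2e}")
```

In this toy setting the acquisition value at the unexplored input is driven largely by the posterior standard deviation, which is why both hyperparameters can quietly suppress exploration when they are set poorly.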

Paper Structure (13 sections, 6 equations, 4 figures, 1 table, 1 algorithm)

Introduction
Background on Bayesian optimization
Gaussian process surrogate models
Acquisition functions
Common Bayesian optimization pitfalls
Pitfall #1: prior width
Pitfall #2: over-smoothing
Pitfall #3: inadequate search
Experiments: fixing these issues substantially improves performance
Discussion
Details of BO setup
Full results
SMILES from Figure 4

Figures (4)

Figure 1: 1D optimization task meant to be qualitatively similar to molecular design tasks. Only a small number of data points are known (black dots), none of which are near the global optimum of the unknown function (red dashed line).
Figure 2: Effect of prior width parameter $\sigma$ in a GP model, illustrating the "prior width" pitfall (Pitfall #1). Low values of $\sigma$ cause the model to predict lower returns from exploration.
Figure 3: Effect of lengthscale parameter $\ell$ in a GP model, illustrating the "over-smoothing" pitfall (Pitfall #2). High values of $\ell$ also imply lower returns from exploring inputs near known inputs.
Figure 4: Two pairs of molecules whose binary Morgan fingerprints of radius 2 are identical. The top pair is two alkanes of different lengths, which only contain -CH$_3$ and -CH$_2$- groups. The bottom pair is the anti-inflammatory drug molecule Celecoxib and a larger analogue with many repeated substructures. SMILES strings are given in the appendix listing the SMILES for this figure.
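
The collision described for Figure 4 can be reproduced qualitatively without a cheminformatics library. The sketch below is a toy stand-in for Morgan fingerprints, not the real algorithm: atom environments are kept as nested tuples rather than hashed bits, alkanes are modelled as path graphs of carbon atoms with implicit hydrogens, and the helpers `path_graph`, `environment_counts`, and `minmax_similarity` are hypothetical names. The count-based similarity is included only as one possible contrast to binary features, not as a statement of the paper's method.

```python
from collections import Counter

def path_graph(n_atoms):
    """Adjacency list of a linear chain of n_atoms carbons (hydrogens implicit)."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n_atoms]
            for i in range(n_atoms)]

def environment_counts(adjacency, radius=2):
    """Count Morgan-style atom environments for every radius from 0 to `radius`."""
    # Radius 0: each atom is described by its element and heavy-atom degree.
    ids = [("C", len(neighbors)) for neighbors in adjacency]
    counts = Counter(ids)
    for _ in range(radius):
        # Grow each environment by one bond: combine an atom's identifier
        # with the sorted identifiers of its neighbors.
        ids = [(ids[i], tuple(sorted(ids[j] for j in neighbors)))
               for i, neighbors in enumerate(adjacency)]
        counts.update(ids)
    return counts

def minmax_similarity(counts_a, counts_b):
    """Min-max (count-based Tanimoto) similarity between two count vectors."""
    keys = set(counts_a) | set(counts_b)
    shared = sum(min(counts_a[k], counts_b[k]) for k in keys)
    total = sum(max(counts_a[k], counts_b[k]) for k in keys)
    return shared / total

short_chain = environment_counts(path_graph(8))    # toy octane
long_chain = environment_counts(path_graph(16))    # toy hexadecane

# Binary view: only which environments occur, not how often.
print("same binary features:", set(short_chain) == set(long_chain))
# Count view: how often each environment occurs.
print("same count features:", short_chain == long_chain)
print("min-max similarity on counts:", minmax_similarity(short_chain, long_chain))
```

In this toy model, once a chain is long enough to contain every kind of end and interior environment, adding more -CH$_2$- groups changes only the counts, so any model that sees binary features alone cannot tell the two chains apart.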
