Latent Bayesian Optimization via Autoregressive Normalizing Flows

Seunghun Lee, Jinyoung Park, Jaewon Chu, Minseo Yoon, Hyunwoo J. Kim

TL;DR

This work tackles the value-discrepancy problem in Latent Bayesian Optimization by introducing NF-BO, which uses SeqFlow, an autoregressive normalizing flow, to achieve a one-to-one mapping between discrete sequences and a continuous latent space with exact reconstruction. It pairs SeqFlow with a token-level adaptive candidate sampling (TACS) strategy to enhance local search within trust regions, enabling efficient optimization for high-dimensional and structured data. Across Guacamol and PMO benchmarks, NF-BO consistently outperforms traditional BO and prior LBO approaches, demonstrating improved molecule design performance and robustness. The approach offers a general, invertible, and scalable framework for discrete sequence optimization with practical implications for chemical design and related domains.

Abstract

Bayesian Optimization (BO) has been recognized for its effectiveness in optimizing expensive and complex objective functions. Recent advancements in Latent Bayesian Optimization (LBO) have shown promise by integrating generative models such as variational autoencoders (VAEs) to manage the complexity of high-dimensional and structured data spaces. However, existing LBO approaches often suffer from the value discrepancy problem, which arises from the reconstruction gap between input and latent spaces. This value discrepancy problem propagates errors throughout the optimization process, leading to suboptimal outcomes. To address this issue, we propose Normalizing Flow-based Bayesian Optimization (NF-BO), which utilizes a normalizing flow as a generative model to establish a one-to-one encoding function from the input space to the latent space, along with its left-inverse decoding function, eliminating the reconstruction gap. Specifically, we introduce SeqFlow, an autoregressive normalizing flow for sequence data. In addition, we develop a new candidate sampling strategy that dynamically adjusts the exploration probability for each token based on its importance. Through extensive experiments, our NF-BO method demonstrates superior performance in molecule generation tasks, significantly outperforming both traditional and recent LBO approaches.

Paper Structure

This paper contains 42 sections, 4 theorems, 21 equations, 14 figures, 9 tables, 1 algorithm.

Key Result

Proposition 1

Let $g$ be a normalizing flow and $h$ an injective function with a nonempty domain $\mathcal{X}$. Then $f := g \circ h$ is left invertible, i.e., $f^{-1} \circ f = \text{id}_{\mathcal{X}}$, where $h^{-1}$ is a left inverse of $h$ and $f^{-1} := h^{-1} \circ g^{-1}$.
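The identity in Proposition 1 follows from $f^{-1} \circ f = h^{-1} \circ g^{-1} \circ g \circ h = h^{-1} \circ h = \text{id}_{\mathcal{X}}$. A minimal numerical sketch of this composition is given below. It is an illustration under stated assumptions, not the paper's SeqFlow: `E` is a random embedding table playing the role of the injective map $h$, the "flow" $g$ is a fixed lower-triangular affine map rather than a learned autoregressive network, and the left inverse of $h$ is a nearest-embedding lookup. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 6, 4  # hypothetical vocabulary size and embedding width

# h: injective token -> embedding lookup (assumption: random, pairwise-distinct rows).
E = rng.normal(size=(VOCAB, DIM))

# g: toy invertible map standing in for a normalizing flow (assumption: affine,
# lower triangular with a positive diagonal, so its determinant is nonzero).
A = np.tril(rng.normal(size=(DIM, DIM)), k=-1) + np.diag(np.exp(rng.normal(size=DIM)))
b = rng.normal(size=DIM)


def h(x):
    """Map token ids of shape (L,) to embeddings of shape (L, DIM)."""
    return E[x]


def h_inv(v):
    """Left inverse of h: index of the nearest embedding for each row of v."""
    sq_dist = ((v[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1)
    return sq_dist.argmin(axis=1)


def g(v):
    """Apply the affine map row-wise: v_i -> A v_i + b."""
    return v @ A.T + b


def g_inv(z):
    """Exact inverse of g, obtained by solving A v_i = z_i - b for every row."""
    return np.linalg.solve(A, (z - b).T).T


def f(x):
    return g(h(x))


def f_inv(z):
    return h_inv(g_inv(z))


x = np.array([0, 3, 5, 1, 1, 2])
x_rec = f_inv(f(x))
assert np.array_equal(x_rec, x)  # f_inv(f(x)) recovers the token sequence
```

Only $f^{-1} \circ f$ is the identity here. $f \circ f^{-1}$ need not be, because most latent points are not the image of any token sequence, which is why the statement speaks of a left inverse.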

Figures (14)

  • Figure 1: Visualization of value discrepancy problem.
  • Figure 2: (a) Most existing LBO approaches suffer from the value discrepancy problem $y \ne \hat{y}$ induced by the reconstruction gap, $p_\theta(q_\phi(\mathbf{x}))\ne \mathbf{x}$. Because of the reconstruction error, $\mathbf{x} \neq \hat{\mathbf{x}}$, the same latent representation $\mathbf{z}$ corresponds to two different evaluation values, $y$ and $\hat{y}$. (b) Our NF-BO addresses the value discrepancy problem by employing a normalizing flow model that ensures a one-to-one mapping between $\mathbf{x}$ and $\mathbf{z}$ via the invertible flow and inverse processes, $\boldsymbol{g}$ and $\boldsymbol{g}^{-1}$, i.e., $\boldsymbol{g}^{-1}(\boldsymbol{g}(\mathbf{x}))=\mathbf{x}$. The latent representation $\mathbf{z}$ is therefore consistently associated with the same evaluation value $y$.
  • Figure 3: Overall pipeline of SeqFlow. Given an input sequence of discrete values $\mathbf{x}$, SeqFlow first maps the discrete values $\mathbf{x}$ to a continuous representation $\mathbf{v}$ and then transforms it via autoregressive transformations $\{g^i\}_{i=0}^{K-1}$ into a latent representation $\mathbf{z}^0$ in the encoding phase (top pathway). In the decoding phase (bottom pathway), SeqFlow reconstructs $\mathbf{x}$ from $\mathbf{z}^0$ through the inverses of these transformations. SeqFlow ensures perfect reconstruction of the discrete input.
  • Figure 4: Overview of NF-BO. We employ our normalizing flow, SeqFlow, as a mapping function between a discrete input space and a continuous latent space. Each discrete input token $\mathbf{x}_i$ is mapped to its corresponding embedding vector $\mathbf{v}_i$ from the dictionary. A surrogate model is then trained on the latent representation $\mathbf z$ encoded by the flow model $\boldsymbol{g}$ and the associated function value $y$ to emulate the objective function. To enhance the efficiency of trust region-based local search, we propose Token-level Adaptive Candidate Sampling (TACS). In TACS, candidates for the acquisition function are generated by perturbing tokens, which are sampled according to a token-level sampling probability $\pi$ specified in Eq. (\ref{eq:TACS}). Given these candidates and the surrogate model, the acquisition function selects the next query points $\tilde{\mathbf{z}}$. Finally, the inverse model $\boldsymbol{g}^{-1}$ generates the embedding $\tilde{\mathbf{v}}$, the most similar dictionary embedding is looked up, and its index is returned as $\tilde{\mathbf x}$.
  • Figure 5: Optimization results of NF-BO on Guacamol benchmarks comparing performance with baselines under two oracle budget settings: (100, 500) (left) and (10,000, 10,000) (right). The shaded regions indicate the standard error over 5 trials.
  • ...and 9 more figures
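
The candidate-generation step described in the Figure 4 caption (perturb only the tokens sampled according to $\pi$, inside a trust region) can be sketched as follows. This is a rough illustration under several assumptions. The equation defining $\pi$ is not reproduced in this section, so `pi` is simply passed in as an input. The box-shaped trust region, the uniform noise, the `half_width` parameter, and the rule that every candidate perturbs at least one token are choices of this sketch rather than details taken from the paper.

```python
import numpy as np


def sample_candidates(z_center, pi, half_width, n_candidates, rng):
    """Perturb a token-level subset of a latent sequence inside a box trust region.

    z_center:   (L, d) latent of the trust-region center, one row per token.
    pi:         (L,) per-token perturbation probabilities (taken as given here).
    half_width: half the trust-region side length (hypothetical parameter).
    """
    L, d = z_center.shape
    # One Bernoulli draw per candidate and token: True means "perturb this token".
    mask = rng.random((n_candidates, L)) < pi[None, :]
    # Convenience choice of this sketch: force one perturbed token in empty rows.
    rows = np.flatnonzero(~mask.any(axis=1))
    mask[rows, rng.integers(0, L, size=rows.size)] = True
    # Uniform noise inside the box, zeroed out for tokens that were not selected.
    noise = rng.uniform(-half_width, half_width, size=(n_candidates, L, d))
    return z_center[None, :, :] + noise * mask[:, :, None]


rng = np.random.default_rng(0)
z_center = rng.normal(size=(5, 3))           # 5 tokens, 3 latent dims per token
pi = np.array([0.9, 0.1, 0.1, 0.5, 0.1])     # made-up token-level probabilities
cands = sample_candidates(z_center, pi, half_width=0.2, n_candidates=8, rng=rng)
print(cands.shape)
```

In a full loop, the acquisition function would score `cands` under the surrogate model, and the selected latent would be decoded through the inverse flow and the nearest-embedding lookup.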

Theorems & Definitions (6)

  • Proposition 1
  • Proposition 2
  • Proposition 1
  • proof
  • Proposition 2
  • proof