Latent Bayesian Optimization via Autoregressive Normalizing Flows

Seunghun Lee; Jinyoung Park; Jaewon Chu; Minseo Yoon; Hyunwoo J. Kim

TL;DR

This work tackles the value discrepancy problem in Latent Bayesian Optimization (LBO) by introducing NF-BO, which uses SeqFlow, an autoregressive normalizing flow, to achieve a one-to-one mapping between discrete sequences and a continuous latent space with exact reconstruction. It pairs SeqFlow with a token-level adaptive candidate sampling (TACS) strategy that strengthens local search within trust regions, enabling efficient optimization over high-dimensional, structured data. Across the GuacaMol and PMO benchmarks, NF-BO consistently outperforms traditional BO and prior LBO approaches on molecule design tasks. The approach offers a general, invertible, and scalable framework for discrete sequence optimization, with practical implications for chemical design and related domains.

Abstract

Bayesian Optimization (BO) has been recognized for its effectiveness in optimizing expensive and complex objective functions. Recent advancements in Latent Bayesian Optimization (LBO) have shown promise by integrating generative models such as variational autoencoders (VAEs) to manage the complexity of high-dimensional and structured data spaces. However, existing LBO approaches often suffer from the value discrepancy problem, which arises from the reconstruction gap between the input and latent spaces. This value discrepancy propagates errors throughout the optimization process, leading to suboptimal outcomes. To address this issue, we propose Normalizing Flow-based Bayesian Optimization (NF-BO), which uses a normalizing flow as the generative model to establish a one-to-one encoding function from the input space to the latent space, along with its left-inverse decoding function, eliminating the reconstruction gap. Specifically, we introduce SeqFlow, an autoregressive normalizing flow for sequence data. In addition, we develop a new candidate sampling strategy that dynamically adjusts the exploration probability for each token based on its importance. Through extensive experiments, our NF-BO method demonstrates superior performance in molecule generation tasks, significantly outperforming both traditional and recent LBO approaches.
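To make the "no reconstruction gap" property concrete, the following is a minimal sketch of a generic affine autoregressive flow over a sequence of continuous vectors. It is not the authors' SeqFlow: the conditioner (a `tanh` of the mean of earlier positions), the weight matrices `w_mu` and `w_s`, and the function names `encode` and `decode` are all assumptions made for illustration, and the step that maps discrete tokens to continuous vectors is left out because it is not described here. The sketch only shows that when each position is transformed by an invertible affine map conditioned on earlier positions, decoding recovers the input up to floating-point error.

```python
import numpy as np

def _conditioner(prefix, w_mu, w_s):
    """Shift and log-scale for one position, computed only from earlier positions.

    Hypothetical conditioner: a tanh of the mean of the prefix. An empty
    prefix (first position) yields a zero context.
    """
    context = prefix.mean(axis=0) if len(prefix) else np.zeros(w_mu.shape[0])
    return np.tanh(context @ w_mu), np.tanh(context @ w_s)

def encode(x, w_mu, w_s):
    """Map a sequence x of shape (T, d) to latent z of the same shape."""
    z = np.empty_like(x)
    for t in range(len(x)):
        mu, log_s = _conditioner(x[:t], w_mu, w_s)
        z[t] = (x[t] - mu) * np.exp(-log_s)
    return z

def decode(z, w_mu, w_s):
    """Invert encode position by position, reusing already decoded positions."""
    x = np.empty_like(z)
    for t in range(len(z)):
        mu, log_s = _conditioner(x[:t], w_mu, w_s)
        x[t] = z[t] * np.exp(log_s) + mu
    return x

# Toy check with random (untrained) parameters.
rng = np.random.default_rng(0)
T, d = 6, 4
w_mu = rng.normal(size=(d, d))
w_s = rng.normal(size=(d, d))
x = rng.normal(size=(T, d))

z = encode(x, w_mu, w_s)
x_rec = decode(z, w_mu, w_s)
max_err = float(np.abs(x - x_rec).max())  # reconstruction error, floating-point level
```

Because the round trip holds for any parameter values, invertibility here is a structural property of the model rather than something a reconstruction loss has to learn, which is the contrast with a VAE encoder and decoder pair that the abstract draws.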
