Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization
Taeyoung Yun, Sujin Yun, Jaewoo Lee, Jinkyoo Park
TL;DR
This work tackles offline model-based optimization (MBO) by reframing design optimization as learning to improve solutions through synthetic trajectories. It introduces GTG (Guided Trajectory Generation), a conditional diffusion framework that constructs locality-biased trajectories from an offline dataset, trains a conditional diffusion model to generate trajectories conditioned on their scores, and uses classifier-free guidance with context conditioning to explore high-scoring regions beyond the data, followed by proxy-based selection. Empirical results on Design-Bench and a toy Branin task show that GTG outperforms competitive baselines, including in sparse and noisy data regimes, indicating that it generalizes to unseen regions of the landscape. The approach uses trajectory-level generative modeling to capture multi-step improvements and landscape structure, and the code is publicly available.
Abstract
Optimizing complex, high-dimensional black-box functions is ubiquitous in science and engineering. In most cases, however, online evaluation of these functions is restricted by time and safety constraints. In offline model-based optimization (MBO), we aim to find a design that maximizes the target function using only a pre-existing offline dataset. While prior methods take forward or inverse approaches to the problem, these approaches are limited by conservatism and by the difficulty of learning highly multi-modal mappings. Recently, a new paradigm has emerged: learning to improve solutions with synthetic trajectories constructed from the offline dataset. In this paper, we introduce a novel conditional generative modeling approach that produces trajectories toward high-scoring regions. First, we construct synthetic trajectories toward high-scoring regions from the dataset while injecting a locality bias for consistent improvement directions. Then, we train a conditional diffusion model to generate trajectories conditioned on their scores. Lastly, we sample multiple trajectories from the trained model with guidance to explore high-scoring regions beyond the dataset, and we select high-fidelity designs among the generated trajectories with the proxy function. Extensive experimental results demonstrate that our method outperforms competitive baselines on Design-Bench and its practical variants. The code is publicly available at https://github.com/dbsxodud-11/GTG.
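To make the first and third steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes one plausible reading of "locality bias" (each step moves to a higher-scoring design among the k nearest neighbours of the current one) and uses the standard classifier-free guidance combination of conditional and unconditional noise predictions. The function names `build_trajectory` and `guided_noise`, and the parameters `horizon`, `k`, and `w`, are hypothetical and do not come from the abstract.

```python
import numpy as np


def build_trajectory(X, y, horizon=8, k=10, rng=None):
    """Hypothetical sketch: chain nearby designs with strictly increasing scores.

    X: (n, d) array of offline designs, y: (n,) array of their scores.
    Returns the designs and scores along one synthetic trajectory.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    # Assumption: start from a design in the lower half of the score ranking.
    order = np.argsort(y)
    current = int(rng.choice(order[: max(1, n // 2)]))
    path = [current]
    for _ in range(horizon - 1):
        # Locality bias (assumed form): only consider the k nearest neighbours.
        dists = np.linalg.norm(X - X[current], axis=1)
        dists[current] = np.inf  # exclude the current design itself
        neighbours = np.argsort(dists)[:k]
        better = neighbours[y[neighbours] > y[current]]
        if better.size == 0:
            break  # no improving neighbour: stop the trajectory early
        current = int(rng.choice(better))
        path.append(current)
    idx = np.array(path)
    return X[idx], y[idx]


def guided_noise(eps_cond, eps_uncond, w):
    """Classifier-free guidance: push the prediction toward the condition.

    w = 0 recovers the purely conditional prediction; larger w extrapolates
    further away from the unconditional one.
    """
    return (1.0 + w) * eps_cond - w * eps_uncond


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(200, 2))
    y = -np.sum(X**2, axis=1)  # toy objective with its maximum at the origin
    traj, scores = build_trajectory(X, y, rng=rng)
    print(traj.shape, np.round(scores, 3))
```

In this sketch a trajectory may be shorter than `horizon` when it reaches a local maximum of its neighbourhood. A diffusion model trained on fixed-length sequences would need such trajectories padded or discarded, a choice the abstract does not specify.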
