ZeroShotOpt: Towards Zero-Shot Pretrained Models for Efficient Black-Box Optimization
Jamison Meindl, Yunsheng Tian, Tony Cui, Veronika Thost, Zhang-Wei Hong, Johannes Dürholt, Jie Chen, Wojciech Matusik, Mina Konaković Luković
TL;DR
This work addresses the challenge of efficiently optimizing expensive, derivative-free black-box functions under tight evaluation budgets by introducing ZeroShotOpt, a pretrained transformer-based optimizer for continuous problems of up to 20D. A 200M-parameter decoder-only transformer is trained with offline reinforcement learning on a large corpus of optimization trajectories generated by 12 Bayesian optimization variants on millions of GP-based synthetic functions, enabling robust zero-shot generalization to unseen benchmarks. The model achieves sample efficiency competitive with traditional Bayesian optimization on both in-distribution and out-of-distribution tasks, and it offers a reusable foundation that can be fine-tuned to specific domains such as HPO-B. ZeroShotOpt also provides practical advantages in runtime and scalability, and the data and code are open-source to support further extensions and real-world deployment.
Abstract
Global optimization of expensive, derivative-free black-box functions requires extreme sample efficiency. While Bayesian optimization (BO) is the current state of the art, its performance hinges on surrogate and acquisition function hyperparameters that are often hand-tuned and fail to generalize across problem landscapes. We present ZeroShotOpt, a general-purpose, pretrained model for continuous black-box optimization tasks ranging from 2D to 20D. Our approach leverages offline reinforcement learning on large-scale optimization trajectories collected from 12 BO variants. To scale pretraining, we generate millions of synthetic Gaussian process-based functions with diverse landscapes, enabling the model to learn transferable optimization policies. As a result, ZeroShotOpt achieves robust zero-shot generalization on a wide array of unseen benchmarks, matching or surpassing the sample efficiency of leading global optimizers, including BO, while also offering a reusable foundation for future extensions and improvements. Our open-source code, dataset, and model are available at: https://github.com/jamisonmeindl/zeroshotopt
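
To make the data-generation idea concrete, the following is a minimal sketch of how a synthetic GP-based function and an optimization trajectory on it might be produced. It is not the released pipeline. Everything in it is an assumption for illustration: the random Fourier feature approximation to an RBF-kernel GP, the unit-hypercube domain, the minimization convention, the names `sample_gp_function` and `random_search_trajectory`, and the use of uniform random search as a stand-in for the 12 BO variants.

```python
import math
import random


def sample_gp_function(dim, lengthscale=0.5, num_features=256, seed=0):
    """Return one approximate draw from a zero-mean GP with an RBF kernel.

    Uses random Fourier features (an illustrative assumption, not
    necessarily how the paper generates its functions).
    """
    rng = random.Random(seed)
    # Frequencies ~ N(0, 1/lengthscale^2) per dimension, phases ~ U(0, 2*pi).
    omegas = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    weights = [rng.gauss(0.0, 1.0) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def f(x):
        total = 0.0
        for omega, phase, weight in zip(omegas, phases, weights):
            dot = sum(o * xi for o, xi in zip(omega, x))
            total += weight * math.cos(dot + phase)
        return scale * total

    return f


def random_search_trajectory(f, dim, budget, seed=0):
    """Record (x, y, best_so_far) for uniform random search on [0, 1]^dim.

    Random search is only a placeholder for a BO variant here.
    """
    rng = random.Random(seed)
    trajectory = []
    best = math.inf
    for _ in range(budget):
        x = [rng.random() for _ in range(dim)]
        y = f(x)
        best = min(best, y)  # minimization is assumed
        trajectory.append((x, y, best))
    return trajectory


f = sample_gp_function(dim=2, seed=1)
trajectory = random_search_trajectory(f, dim=2, budget=20, seed=2)
```

Repeating this with different seeds, dimensions, and lengthscales would yield many functions with varied landscapes, each paired with a sequence of queried points and observed values, which is the general shape of data an offline-trained optimizer could learn from.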
