Learning Surrogates for Offline Black-Box Optimization via Gradient Matching

Minh Hoang; Azza Fadhel; Aryan Deshwal; Janardhan Rao Doppa; Trong Nghia Hoang

TL;DR

The paper tackles offline black-box optimization, where a surrogate learned from offline data can misguide gradient-based search once the search leaves the offline data regime. It derives a theoretical bound showing that the offline optimization gap is controlled by how closely the surrogate's gradient matches the true gradient, and introduces MATCH-OPT, a gradient-matching surrogate learning algorithm that uses line-integral gradient information and monotonic trajectories extracted from the offline data. Experiments on six design benchmarks complement the theory: MATCH-OPT performs reliably and competitively, and improves over strong baselines. The work offers a principled and practical route to more robust offline optimization, with potential impact on material, chemical, and hardware design problems.

Abstract

The offline design optimization problem arises in numerous science and engineering applications, including material and chemical design, where expensive online experimentation necessitates the use of in silico surrogate functions to predict and maximize the target objective over candidate designs. Although these surrogates can be learned from offline data, their predictions are often inaccurate outside the offline data regime. This challenge raises a fundamental question about the impact of an imperfect surrogate model on the performance gap between its optima and the true optima, and about the extent to which the performance loss can be mitigated. Although prior work has developed methods to improve the robustness of surrogate models and their associated optimization processes, a provably quantifiable relationship between an imperfect surrogate and the corresponding performance gap remains elusive, as does the question of whether prior methods directly address it. To shed light on this important question, we present a theoretical framework for understanding offline black-box optimization that explicitly bounds the optimization quality by how well the surrogate matches the latent gradient field that underlies the offline data. Inspired by our theoretical analysis, we propose a principled black-box gradient matching algorithm to create effective surrogate models for offline optimization, improving over prior approaches on various real-world benchmarks.
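
To make the gradient-matching idea concrete, here is a minimal sketch of how line-integral information can supervise a surrogate's gradient field using only offline input/output pairs. It relies on the standard identity that the difference in objective values between two designs equals the line integral of the gradient along the segment joining them. This is not the authors' MATCH-OPT implementation: the toy quadratic `objective`, the one-parameter surrogate gradient `make_surrogate_grad`, the all-pairs loss, and the midpoint quadrature are all assumptions chosen for illustration.

```python
import random

def objective(x):
    # Hypothetical stand-in for the black-box objective (assumption, not from the paper).
    return -sum((xi - 1.0) ** 2 for xi in x)

def make_surrogate_grad(w):
    # Hypothetical surrogate gradient field: gradient of -w * ||x - 1||^2.
    # With w = 1.0 it coincides with the true gradient of `objective`.
    def grad(x):
        return [-2.0 * w * (xi - 1.0) for xi in x]
    return grad

def path_integral(grad_fn, x_a, x_b, n_steps=16):
    # Midpoint-rule estimate of the line integral of grad_fn along the
    # straight segment from x_a to x_b.
    delta = [b - a for a, b in zip(x_a, x_b)]
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) / n_steps
        point = [a + t * d for a, d in zip(x_a, delta)]
        grad = grad_fn(point)
        total += sum(g * d for g, d in zip(grad, delta))
    return total / n_steps

def gradient_matching_loss(grad_fn, xs, zs):
    # Mean squared mismatch between observed output differences and the
    # differences implied by integrating the surrogate gradient.
    errors = []
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            predicted = path_integral(grad_fn, xs[i], xs[j])
            observed = zs[j] - zs[i]
            errors.append((observed - predicted) ** 2)
    return sum(errors) / len(errors)

# Toy offline dataset: random designs and their objective values.
rng = random.Random(0)
xs = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(8)]
zs = [objective(x) for x in xs]

loss_exact = gradient_matching_loss(make_surrogate_grad(1.0), xs, zs)
loss_wrong = gradient_matching_loss(make_surrogate_grad(0.5), xs, zs)
```

In this toy setting, the surrogate whose gradient equals the true gradient yields a loss that is zero up to floating-point rounding, while the mis-scaled surrogate yields a strictly larger loss. The loss never queries the true gradient, only differences of recorded objective values, which is what makes this kind of supervision available in the offline setting.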
