Few-Shot Design Optimization by Exploiting Auxiliary Information

Arjun Mani, Carl Vondrick, Richard Zemel

TL;DR

This work addresses expensive black-box design optimization in which each trial returns a scalar reward \(f(\mathbf{x})\) together with rich auxiliary information \(h(\mathbf{x})\), and a history of related tasks is available for generalizing to unseen tasks. It proposes a transformer-based neural surrogate \(P_\theta\) that performs few-shot probabilistic prediction of \(f(\mathbf{x})\) conditioned on a small context \(C\) containing observed \(h(\mathbf{x})\) and \(f(\mathbf{x})\), and integrates this surrogate into Bayesian Optimization without online fine-tuning. The method is validated on two challenging domains (robotic gripper design with tactile feedback, and neural-network hyperparameter tuning on LCBench) and is accompanied by a large gripper benchmark of 4.28 million evaluations. It achieves more accurate few-shot prediction and faster optimization than multi-task BO baselines and GP-based approaches. The results indicate that learning to represent and exploit auxiliary information across tasks can improve sample efficiency and design quality in information-rich experimental settings, with implications for AI-driven design and discovery.

Abstract

Many real-world design problems involve optimizing an expensive black-box function $f(x)$, such as hardware design or drug discovery. Bayesian Optimization has emerged as a sample-efficient framework for this problem. However, the basic setting considered by these methods is simplified compared to real-world experimental setups, where experiments often generate a wealth of useful information. We introduce a new setting where an experiment generates high-dimensional auxiliary information $h(x)$ along with the performance measure $f(x)$; moreover, a history of previously solved tasks from the same task family is available for accelerating optimization. A key challenge of our setting is learning how to represent and utilize $h(x)$ for efficiently solving new optimization tasks beyond the task history. We develop a novel approach for this setting based on a neural model which predicts $f(x)$ for unseen designs given a few-shot context containing observations of $h(x)$. We evaluate our method on two challenging domains, robotic hardware design and neural network hyperparameter tuning, and introduce a novel design problem and large-scale benchmark for the former. On both domains, our method utilizes auxiliary feedback effectively to achieve more accurate few-shot prediction and faster optimization of design tasks, significantly outperforming several methods for multi-task optimization.
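
The optimization loop summarized above (predict \(f(\mathbf{x})\) from a few-shot context, pick a promising design, evaluate it, and append the new observation to the context) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_surrogate` is a hypothetical kernel-weighted stand-in for the trained transformer \(P_\theta\) and ignores \(h(\mathbf{x})\); `evaluate` is a made-up toy experiment; and the upper-confidence-bound acquisition with weight `beta` is an assumed choice, since the text here does not name the acquisition function.

```python
import numpy as np

def toy_surrogate(context, candidates, length_scale=0.2):
    """Hypothetical stand-in for P_theta: returns a mean and std of f per candidate.

    The real model is a trained transformer that also consumes h(x); this
    placeholder ignores h and only illustrates the predict-from-context interface.
    """
    xs = np.array([c["x"] for c in context])   # (N_C, d)
    fs = np.array([c["f"] for c in context])   # (N_C,)
    d2 = ((candidates[:, None, :] - xs[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * length_scale ** 2))  # (N, N_C) similarity weights
    total = w.sum(axis=1)
    norm = w / (total[:, None] + 1e-12)
    mean = norm @ fs
    var = norm @ (fs ** 2) - mean ** 2
    # Add extra uncertainty for candidates far from every context point.
    std = np.sqrt(np.maximum(var, 0.0)) + 1.0 / (1.0 + total)
    return mean, std

def evaluate(x):
    """Toy experiment (assumption): returns a reward f(x) and auxiliary info h(x)."""
    f = -float(np.sum((x - 0.7) ** 2))
    h = np.linspace(0.0, f, 5)  # pretend per-step trace, e.g. a learning curve
    return f, h

def optimize(candidates, n_init=3, n_iters=10, beta=1.0, seed=0):
    rng = np.random.default_rng(seed)
    context, tried = [], set()
    # Seed the context with a few randomly chosen designs.
    for i in rng.choice(len(candidates), size=n_init, replace=False):
        f, h = evaluate(candidates[i])
        context.append({"x": candidates[i], "f": f, "h": h})
        tried.add(int(i))
    for _ in range(n_iters):
        mean, std = toy_surrogate(context, candidates)
        score = mean + beta * std            # UCB acquisition (assumed)
        score[list(tried)] = -np.inf         # never re-evaluate a design
        nxt = int(np.argmax(score))
        f, h = evaluate(candidates[nxt])
        # The surrogate is not fine-tuned; only its context grows.
        context.append({"x": candidates[nxt], "f": f, "h": h})
        tried.add(nxt)
    best = max(context, key=lambda c: c["f"])  # highest observed reward
    return best, context

grid = np.linspace(0.0, 1.0, 11)
candidates = np.array([[a, b] for a in grid for b in grid])
best, context = optimize(candidates)
print(best["x"], best["f"])
```

Note that the surrogate is only queried with a larger context at each step; no gradient updates occur inside the loop, which mirrors the "without online fine-tuning" property described in the TL;DR.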

Paper Structure (27 sections, 2 equations, 11 figures, 2 algorithms)

Figures (11)

  • Figure 1: Few-shot design optimization with our method. Part (a) shows examples of two design optimization problems in our setting, where for each problem, the design $x$ is indicated, the reward $f(x)$ evaluating design quality is shown, and the auxiliary information $h(x)$ obtained when evaluating a design is indicated. The first problem involves designing a robotic gripper that grasps an object as stably as possible, using tactile feedback of the object during each grasp attempt. In the second problem, evaluating a hyperparameter setting returns per-epoch learning curves as $h(x)$, which can provide useful information beyond the reward (e.g. indicating overfitting, as in the example above). Part (b) shows how our method performs design optimization in a loop. Our model $P_\theta$ accepts a few-shot context of observations for a design task, including $h(x)$, and predicts the reward $f(x)$ for an unobserved design. These predictions are used to select a promising new design $x_{t+1}$ for evaluation. The design is evaluated, the new observation is added to the model's context, and the next iteration begins with this updated context for $P_\theta$. At termination, the design $x^*$ with the highest observed reward is returned. $P_\theta$ is trained on a history of design tasks to acquire this few-shot prediction ability (described in the method section).
  • Figure 2: Model architecture for our method. (a) In our model for few-shot probabilistic prediction, a small context set of $N_C$ observations is provided as conditioning, in which observations of $f(\mathbf{x})$ and $h(\mathbf{x})$ are available. A target set of size $N_T$ with only inputs $\mathbf{x}$ is also provided and is encoded separately. The encoded data points pass through a transformer in which each target point can only attend to the context, and a prediction of $f(\mathbf{x})$ is made for each of the target points. (b) A zoomed-in view of the context encoder that encodes each context point, showing its architecture when $h(\mathbf{x}_i)$ is a temporal sequence of $K$ observations. The temporal sequence, with a token added for $(\mathbf{x}_i, f(\mathbf{x}_i))$, is embedded by a sequence transformer, and the result is added to an embedding of $(\mathbf{x}_i, f(\mathbf{x}_i))$ to obtain the final encoding $e_i^{ctxt}$.
  • Figure 3: Algorithm 1 Training Procedure
  • Figure 4: A tale of two grippers: examples of our grasping simulation. The figure above shows two gripper designs for grasping a chair object, which is a task in the test set. For both, the simulation starts by closing the gripper jaws and lifting the object to a fixed height. Then, an initial disturbance force $F_{init}$ is applied (here, in the $-z$ direction). As long as the object remains in the grasp, the force is incremented by $0.1 \text{N}$, repeating until some maximum time $T_{max}$. The reward $f(\mathbf{x})$ is the maximum disturbance force applied before the grasp fails, which measures the stability of the grasp. The top row shows a gripper with a relatively flat surface, which can only resist a mild disturbance before the object falls. The bottom row shows a high-quality gripper discovered by our method during optimization. This gripper wraps around both chair legs right under the seat, while also clasping the front leg above the seat, thus resisting any downward or horizontal disturbances. See the close-up view for further clarity on this strategy. It maintains the grasp until the end, achieving greater reward than the first gripper.
  • Figure 5: Prediction results for gripper design. Left shows the MSE on the target set for different context sizes, averaged across test tasks. Our method significantly outperforms all baselines, including SoTA baselines that utilize reward alone (DGP, Ours w/o h) and a GP-based approach for utilizing auxiliary information (GP-H). Right shows a direct comparison to Ours w/o h, highlighting our method's improvement especially for smaller context sizes. For each context size, 10 context/target set pairs were sampled for each test task, and the MSE is averaged across all pairs and tasks. The MSE of each target set is summed over its 100 designs.
  • ...and 6 more figures
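
The attention pattern described in the Figure 2 caption, where each target point can only attend to the context, can be illustrated with a small masked-attention sketch. This is a hedged toy example rather than the paper's architecture: the function names `build_mask` and `masked_attention` are hypothetical, the token layout (context tokens first, then target tokens) is an assumption, and the choice to let context tokens attend to all context tokens is also an assumption, since the caption only constrains the target points. Queries, keys, and values reuse the raw token embeddings, with no learned projections.

```python
import numpy as np

def build_mask(n_ctx, n_tgt):
    """Boolean mask where entry (i, j) is True if token i may attend to token j.

    Assumed layout: the first n_ctx tokens are context, the last n_tgt are targets.
    Every token may attend to context tokens only, so no token attends to a target.
    """
    n = n_ctx + n_tgt
    mask = np.zeros((n, n), dtype=bool)
    mask[:, :n_ctx] = True
    return mask

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention with a boolean mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)      # block disallowed pairs
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
n_ctx, n_tgt, dim = 3, 2, 4
tokens = rng.normal(size=(n_ctx + n_tgt, dim))    # stand-in encoded points
mask = build_mask(n_ctx, n_tgt)
out, weights = masked_attention(tokens, tokens, tokens, mask)
print(weights.round(3))
```

One consequence of this mask is that the output for a given target point does not depend on which other target points are in the batch, so predictions for many candidate designs can be made in a single forward pass without the candidates influencing each other.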