Bayesian Optimization of Catalysis With In-Context Learning
Mayk Caldas Ramos, Shane S. Michtavy, Marc D. Porosoff, Andrew D. White
TL;DR
The paper introduces BO-ICL, a framework that treats language-based experimental procedures as the design space and uses frozen large language models (LLMs) with in-context learning as surrogate models in Bayesian optimization (BO). Because it needs neither feature engineering nor model retraining, it supports zero-shot to few-shot optimization, demonstrated on ESOL solubility, OCM, alloy-interface, and RWGS datasets, including a real-world RWGS experiment that comes close to the thermodynamic limit on yield within six iterations. Key findings are that LLM-based surrogates can provide uncertainty estimates, match or outperform traditional baselines in BO, and operate directly in language space, which enables rapid, interpretable materials design with open-source tooling. The work also discusses calibration, hallucination, data leakage, and the exploration-exploitation trade-off, and offers practical guidance for deploying language-driven BO in catalysis and broader materials science. Overall, BO-ICL redefines materials representation and accelerates discovery by using natural language as a universal interface for optimization.
Abstract
Large language models (LLMs) can perform accurate classification with zero or few examples through in-context learning. We extend this capability to regression with uncertainty estimation using frozen LLMs (e.g., GPT-3.5, Gemini), enabling Bayesian optimization (BO) in natural language without explicit model training or feature engineering. We apply this to materials discovery by representing experimental catalyst synthesis and testing procedures as natural language prompts. A key challenge in materials discovery is the need to characterize suboptimal candidates, which slows progress. While BO is effective for navigating large design spaces, standard surrogate models like Gaussian processes assume smoothness and continuity, an assumption that fails in highly non-linear domains such as heterogeneous catalysis. Our task-agnostic BO workflow overcomes this by operating directly in language space, producing interpretable and actionable predictions without requiring structural or electronic descriptors. On benchmarks like aqueous solubility and oxidative coupling of methane (OCM), BO-ICL matches or outperforms Gaussian processes. In live experiments on the reverse water-gas shift (RWGS) reaction, BO-ICL identifies near-optimal multi-metallic catalysts within six iterations from a pool of 3,700 candidates. Our method redefines materials representation and accelerates discovery, with broad applications across catalysis, materials science, and AI. Code: https://github.com/ur-whitelab/BO-ICL.
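The loop described in the abstract can be illustrated with a minimal, self-contained sketch. It is not the implementation in the BO-ICL repository. Every name here (`build_prompt`, `predict`, `bo_loop`, `make_toy_sampler`, `toy_oracle`), the prompt template, the pool of procedures, and the toy yield function are hypothetical. Two further assumptions are made for illustration only: uncertainty is taken as the spread of several sampled predictions, and candidates are ranked with an upper-confidence-bound acquisition function. A seeded nearest-neighbor stub stands in for the frozen LLM so that the example runs offline.

```python
import math
import random
import re
import statistics
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, float]       # (procedure text, measured value)
Sampler = Callable[[str], float]  # prompt -> one sampled numeric prediction


def build_prompt(examples: Sequence[Example], candidate: str) -> str:
    """Few-shot prompt: labeled procedures as context, then the unlabeled query."""
    blocks = [f"Procedure: {text}\nYield: {value:.2f}" for text, value in examples]
    blocks.append(f"Procedure: {candidate}\nYield:")
    return "\n\n".join(blocks)


def predict(sampler: Sampler, examples: Sequence[Example], candidate: str,
            n_samples: int = 5) -> Tuple[float, float]:
    """Assumed uncertainty scheme: mean and spread of repeated sampled predictions."""
    prompt = build_prompt(examples, candidate)
    draws = [sampler(prompt) for _ in range(n_samples)]
    return statistics.fmean(draws), statistics.pstdev(draws)


def bo_loop(pool: Sequence[str], oracle: Callable[[str], float], sampler: Sampler,
            n_init: int = 2, n_iter: int = 4, beta: float = 1.0,
            seed: int = 0) -> List[Example]:
    """Pool-based BO: label a few random candidates, then pick by UCB each round."""
    rng = random.Random(seed)
    remaining = list(pool)
    rng.shuffle(remaining)
    observed: List[Example] = []
    for _ in range(n_init):
        candidate = remaining.pop()
        observed.append((candidate, oracle(candidate)))
    for _ in range(n_iter):
        if not remaining:
            break
        scores = []
        for candidate in remaining:
            mean, std = predict(sampler, observed, candidate)
            scores.append(mean + beta * std)  # upper confidence bound (assumed)
        best = max(range(len(remaining)), key=scores.__getitem__)
        candidate = remaining.pop(best)
        observed.append((candidate, oracle(candidate)))  # "run the experiment"
    return observed


def make_toy_sampler(seed: int = 0) -> Sampler:
    """Hypothetical stand-in for a frozen LLM: nearest-neighbor guess plus noise."""
    rng = random.Random(seed)

    def sample(prompt: str) -> float:
        temps = [float(t) for t in re.findall(r"at (\d+) C", prompt)]
        values = [float(v) for v in re.findall(r"Yield: (\d+\.\d+)", prompt)]
        if not values:  # zero-shot case: no context to lean on
            return rng.random()
        query = temps[-1]  # the last procedure in the prompt is the query
        i = min(range(len(values)), key=lambda k: abs(temps[k] - query))
        spread = 0.02 + abs(temps[i] - query) / 1000.0  # noisier far from context
        return values[i] + rng.gauss(0.0, spread)

    return sample


def toy_oracle(procedure: str) -> float:
    """Hypothetical experiment: yield peaks at 600 C."""
    temp = float(re.search(r"at (\d+) C", procedure).group(1))
    return math.exp(-((temp - 600.0) / 150.0) ** 2)


pool = [f"Calcine the catalyst at {t} C" for t in range(300, 901, 50)]
history = bo_loop(pool, toy_oracle, make_toy_sampler(), n_init=2, n_iter=4)
best_text, best_value = max(history, key=lambda pair: pair[1])
print(f"Best of {len(history)} experiments: {best_text!r} (yield {best_value:.2f})")
```

To try this with a real model, `make_toy_sampler` would be replaced by a function that sends the prompt to an LLM and parses one numeric completion. The rest of the loop only needs a mean and a spread per candidate, so it works unchanged with any other surrogate that provides both.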
