Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design

Henry W. Sprueill; Carl Edwards; Mariefel V. Olarte; Udishnu Sanyal; Heng Ji; Sutanay Choudhury

TL;DR

The paper tackles the challenge of discovering novel catalysts within a combinatorial design space by leveraging large language models (LLMs) through a structured, tree-based prompting approach. It introduces the Monte Carlo Reasoner (MCR), which uses Monte Carlo Tree Search to explore a tree of prompt variants and optimize a domain-specific reward function based on adsorption energies to guide catalyst selection. Two datasets are introduced: BioFuelQR for complex catalysis reasoning and a chemistry-focused benchmark derived from OpenCatalysis OC20, with experiments showing substantial gains over Chain-of-Thought and related baselines and favorable expert assessments. The work demonstrates a zero-shot framework that augments scientific reasoning with LLMs, while acknowledging limitations such as computational cost and API reliance, and points to future integration with atomistic simulations for more trustworthy rewards.
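The tree-based prompting idea can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the action list, the `build_prompt` template, the `TOY_WEIGHTS` table, and the `toy_reward` function are all hypothetical stand-ins (the paper's reward is based on adsorption energies, and each node would query an LLM). The sketch only shows the generic select, expand, evaluate, and backpropagate loop of Monte Carlo Tree Search over prompt criteria.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical prompt criteria ("actions"); not taken from the paper.
ACTIONS = ["low cost", "high selectivity", "high activity", "low toxicity"]
# Hypothetical scores standing in for a domain-specific reward.
TOY_WEIGHTS = {"low cost": 0.2, "high selectivity": 0.9,
               "high activity": 0.7, "low toxicity": 0.4}


def build_prompt(criteria):
    """Compose the query a node would send to an LLM (illustrative template)."""
    base = "Recommend catalysts for the target reaction."
    if not criteria:
        return base
    return base + " Consider: " + ", ".join(criteria) + "."


def toy_reward(criteria):
    """Placeholder reward in [0, 1]: mean toy weight of the chosen criteria."""
    if not criteria:
        return 0.0
    return sum(TOY_WEIGHTS[c] for c in criteria) / len(criteria)


@dataclass
class Node:
    criteria: tuple = ()
    parent: Optional["Node"] = None
    children: dict = field(default_factory=dict)
    visits: int = 0
    total: float = 0.0

    def untried(self):
        # Criteria not yet on this path and not yet expanded from this node.
        return [a for a in ACTIONS
                if a not in self.criteria and a not in self.children]


def ucb1(child, parent_visits, c=1.4):
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations=200, max_depth=3, seed=0):
    rng = random.Random(seed)
    root = Node()
    best = ((), 0.0)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded, non-terminal nodes.
        while (len(node.criteria) < max_depth
               and not node.untried() and node.children):
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
        # Expansion: add one new criterion to the prompt.
        if len(node.criteria) < max_depth and node.untried():
            action = rng.choice(node.untried())
            child = Node(criteria=node.criteria + (action,), parent=node)
            node.children[action] = child
            node = child
        # Evaluation: a real system would query an LLM and score its answer.
        reward = toy_reward(node.criteria)
        if reward > best[1]:
            best = (node.criteria, reward)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best
```

Calling `search()` returns the best criteria tuple found and its toy reward, and `build_prompt` turns that tuple into the corresponding query text. In a realistic setting, the evaluation step would dominate the cost, since every new node implies at least one LLM call.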

Abstract

Discovering novel catalysts requires complex reasoning involving multiple chemical properties and the resulting trade-offs, leading to a combinatorial growth in the search space. While large language models (LLMs) have demonstrated novel capabilities for chemistry through complex instruction following and high-quality reasoning, a goal-driven combinatorial search using LLMs has not been explored in detail. In this work, we present a Monte Carlo Tree Search-based approach that improves on state-of-the-art chain-of-thought prompting variants to augment scientific reasoning. We introduce two new reasoning datasets: 1) a curation of computational chemistry simulations, and 2) diverse questions written by catalysis researchers for reasoning about novel chemical conversion processes. We improve over the best baseline by 25.8% and find that our approach can augment scientists' reasoning and discovery process with novel insights.

Paper Structure (20 sections, 3 equations, 10 figures, 3 tables, 1 algorithm)

Introduction
Monte Carlo Reasoner
Problem definition
Reward Function
Experiments
Conclusion and Future Work
Background
Scientific Drivers from Catalysis
Motivation for molecular energy prediction as a reward function
Related work
Multi-modal models for Chemistry
LLMs for Chemistry
Chain-of-Thought (CoT) Variants
Dataset Design
Action-Driven Prompt Design
...and 5 more sections

Figures (10)

Figure 1: An example prompt design via tree search. The search begins with a generic query at the root node. The answer from each node is passed to its child nodes, and additional criteria (for instance, low cost) are added to the prompt. Information passed to child nodes is color coded to show the reasoning pathway.
Figure 2: Illustration of the combinatorial thinking used by human experts to reason about a catalyst (best viewed in color). They successively "think in terms of" different constraints and factors, each of which are related via scientific principles, and narrow down the set of possible candidates. Our Monte Carlo Reasoner emulates such cognitive thinking by prompting a language model with different combinations, yielding a tree-structured space of queries and potential candidates, and returns the optimal answer via efficient exploration of the possible space.
Figure 3: Domain expert evaluation of LLM answers on the reasoning path to the final node with highest reward.
Figure 4: Example queries from the BioFuelQR dataset representing reasoning with different combinations of chemical descriptors.
Figure 5: Example question and human answer from our compiled QA-dataset.
...and 5 more figures
