Chemical reasoning in LLMs unlocks strategy-aware synthesis planning and reaction mechanism elucidation
Andres M Bran, Theo A Neukomm, Daniel P Armstrong, Zlatko Jončev, Philippe Schwaller
TL;DR
The paper reframes chemical reasoning as a problem of guiding traditional search with LLM-based strategic evaluation, rather than generating structures directly. It introduces Synthegy, a framework that couples AiZynthFinder-style retrosynthesis with LLMs to satisfy strategic constraints and to elucidate reaction mechanisms via guided search. Results show that state-of-the-art LLMs, including open models, can meaningfully assess route feasibility and propose plausible mechanisms, with larger models achieving higher fidelity and explainability. The work highlights both the promise of human-like chemical reasoning in automation and its current limitations (long sequences, input formatting, and model biases), and it outlines a path toward more intuitive, capable computer-aided chemistry systems.
Abstract
While automated chemical tools excel at specific tasks, they have struggled to capture the strategic thinking that characterizes expert chemical reasoning. Here we demonstrate that large language models (LLMs) can serve as powerful tools for chemical analysis. When integrated with traditional search algorithms, they enable a new approach to computer-aided synthesis that mirrors human expert thinking. Rather than using LLMs to directly manipulate chemical structures, we leverage their ability to evaluate chemical strategies and guide search algorithms toward chemically meaningful solutions. We demonstrate this paradigm through two fundamental challenges: strategy-aware retrosynthetic planning and mechanism elucidation. In retrosynthetic planning, our system allows chemists to specify desired synthetic strategies in natural language (from protecting group strategies to global feasibility assessment) and uses traditional or LLM-guided Monte Carlo Tree Search to find routes that satisfy these constraints. In mechanism elucidation, LLMs guide the search for plausible reaction mechanisms by combining chemical principles with systematic exploration. This approach shows strong performance across diverse chemical tasks, with newer and larger models demonstrating increasingly sophisticated chemical reasoning. Our approach establishes a new paradigm for computer-aided chemistry that combines the strategic understanding of LLMs with the precision of traditional chemical tools, opening possibilities for more intuitive and powerful chemical automation systems.
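The central idea, using an evaluator to rank candidates produced by a conventional search instead of generating structures directly, can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `Route`, `rank_routes`, and `keyword_scorer` are illustrative names, and the keyword scorer is only a toy stand-in for an LLM call that would return a score for how well a route matches a natural-language strategy.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Route:
    """Hypothetical container: a candidate route as ordered step descriptions."""
    steps: Tuple[str, ...]


def keyword_scorer(route: Route, query: str) -> float:
    """Toy stand-in for an LLM judgment (assumption, not the paper's method).

    Returns the fraction of query words that appear as whole words
    in the route's step descriptions.
    """
    query_words = query.lower().split()
    if not query_words:
        return 0.0
    route_words = set(" ".join(route.steps).lower().split())
    return sum(word in route_words for word in query_words) / len(query_words)


def rank_routes(
    routes: Sequence[Route],
    query: str,
    score_fn: Callable[[Route, str], float],
) -> List[Route]:
    """Order routes by descending score; ties keep their original order."""
    scored = [(score_fn(route, query), index, route)
              for index, route in enumerate(routes)]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [route for _, _, route in scored]


# Candidate routes as a search tool might return them (made-up examples).
routes = [
    Route(("amide coupling", "Boc deprotection")),
    Route(("Boc protection", "Suzuki coupling", "Boc deprotection")),
]
ranked = rank_routes(routes, "Boc protection", keyword_scorer)
```

In a real system the scoring function would wrap a model call, and the search would use the scores during exploration, not only to rerank finished routes. The sketch shows only the separation of concerns: the search proposes, the evaluator judges.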
