CheMatAgent: Enhancing LLMs for Chemistry and Materials Science through Tree-Search Based Tool Learning

Mengsong Wu, YaFei Wang, Yidong Ming, Yuqi An, Yuwei Wan, Wenliang Chen, Binbin Lin, Yuqiang Li, Tong Xie, Dongzhan Zhou

TL;DR

CheMatAgent addresses outdated chemistry knowledge in LLMs by combining a large, domain-specific tool pool, a tool-learning dataset, and a hierarchical search framework. Its HE-MCTS framework decouples tool planning from tool execution and uses self-generated data both to fine-tune the policy model at the step level and to train task-adaptive process and outcome reward models (PRM and ORM), which the authors report surpass GPT-4o on chemistry QA and discovery tasks. Key contributions are a pool of 137 chemistry and materials science tools, described as the largest of its kind; the ChemToolBench dataset; and the HE-MCTS framework with enhanced self-training, which enables autonomous tool-based reasoning without manual annotation. The work reports improved tool selection and execution, scalable tool integration, and cross-domain generalization, offering a pathway for deploying specialized tools with LLMs in chemical research.

Abstract

Large language models (LLMs) have recently demonstrated promising capabilities in chemistry tasks, yet they still face challenges due to outdated pretraining knowledge and the difficulty of incorporating specialized chemical expertise. To address these issues, we propose an LLM-based agent that integrates 137 external chemical tools, ranging from basic information retrieval to complex reaction prediction, together with a dataset curation pipeline that generates ChemToolBench, a dataset that facilitates both effective tool selection and precise parameter filling during fine-tuning and evaluation. We introduce a Hierarchical Evolutionary Monte Carlo Tree Search (HE-MCTS) framework that enables independent optimization of tool planning and execution. By leveraging self-generated data, our approach supports step-level fine-tuning (FT) of the policy model and the training of task-adaptive PRM and ORM that surpass GPT-4o. Experimental evaluations demonstrate that our approach significantly improves performance on chemistry QA and discovery tasks, offering a robust solution for integrating specialized tools with LLMs in advanced chemical applications. All datasets and code are available at https://github.com/AI4Chem/ChemistryAgent.

Paper Structure

This paper contains 33 sections, 11 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Overview of our CheMatAgent.
  • Figure 2: Domain-specific Tool Learning dataset construction pipeline.
  • Figure 3: HE-MCTS pipeline. The left part shows the search-based hierarchical inference process; the right part shows the self-training process.
  • Figure 4: Contrast of $D^p$ and $\tilde{D}^p$.
  • Figure 5: H-MCTS process
  • ...and 1 more figure