CACTUS: Chemistry Agent Connecting Tool-Usage to Science

Andrew D. McNaughton, Gautham Ramalaxmi, Agustin Kruel, Carter R. Knutson, Rohith A. Varikoti, Neeraj Kumar

TL;DR

CACTUS is an LLM-based agent that integrates existing cheminformatics tools to enable accurate, advanced reasoning and problem-solving in chemistry and molecular discovery. It can assist researchers in tasks such as molecular property prediction, similarity searching, and drug-likeness assessment.

Abstract

Large language models (LLMs) have shown remarkable potential in various domains, but they often lack the ability to access and reason over domain-specific knowledge and tools. In this paper, we introduce CACTUS (Chemistry Agent Connecting Tool-Usage to Science), an LLM-based agent that integrates cheminformatics tools to enable advanced reasoning and problem-solving in chemistry and molecular discovery. We evaluate the performance of CACTUS using a diverse set of open-source LLMs, including Gemma-7b, Falcon-7b, MPT-7b, Llama2-7b, and Mistral-7b, on a benchmark of thousands of chemistry questions. Our results demonstrate that CACTUS significantly outperforms baseline LLMs, with the Gemma-7b and Mistral-7b models achieving the highest accuracy regardless of the prompting strategy used. Moreover, we explore the impact of domain-specific prompting and hardware configurations on model performance, highlighting the importance of prompt engineering and the potential for deploying smaller models on consumer-grade hardware without significant loss in accuracy. By combining the cognitive capabilities of open-source LLMs with domain-specific tools, CACTUS can assist researchers in tasks such as molecular property prediction, similarity searching, and drug-likeness assessment. Furthermore, CACTUS represents a significant milestone in the field of cheminformatics, offering an adaptable tool for researchers engaged in chemistry and molecular discovery. By integrating the strengths of open-source LLMs with domain-specific tools, CACTUS has the potential to accelerate scientific advancement and unlock new frontiers in the exploration of novel, effective, and safe therapeutic candidates, catalysts, and materials. Moreover, CACTUS's ability to integrate with automated experimentation platforms and make data-driven decisions in real time opens up new possibilities for autonomous discovery.

Paper Structure

This paper contains 14 sections, 5 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: General workflow of the CACTUS agent, detailing how the LLM interprets an input to arrive at the correct tool to use to obtain an answer. Starting from the user input, CACTUS follows a standard "Chain-of-thought" reasoning method with Planning, Action, Execution, and Observation phases to obtain an informed output.
  • Figure 2: Comparison of the Gemma-7b model with different prompting strategies on the full question set benchmark. The qualitative question set shows significant improvement when moving from the minimal prompt to the domain prompt, while the quantitative question set shows similar performance under both.
  • Figure 3: Comparison of model performance among 7B parameter models using minimal and domain-specific prompts. The Gemma-7b and Mistral-7b models demonstrate strong performance and adaptability across prompting strategies, highlighting their potential for widespread applicability in various computational settings, from high-performance clusters to more modest research setups.
  • Figure 4: Comparison of model performance using accuracy and execution time as key metrics. The study evaluates various open-source models available on HuggingFace, including Gemma-7b, Falcon-7b, MPT-7b, Llama2-7b, Mistral-7b, phi2, and olmo1b. Different combinations of conditions, such as model type (Vicuna, LLaMa, MPT), prompting strategy (minimal or domain-specific), GPU hardware (A100, V100, or consumer-grade), and benchmark size (small or large), were used to assess the models' capabilities.
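
The Planning, Action, Execution, and Observation phases described for Figure 1 can be illustrated with a minimal sketch. This is not the CACTUS implementation: the keyword-based `plan` function is a hypothetical stand-in for the LLM, and the two toy tools (`molecular_weight`, `heavy_atom_count`) work on simple molecular formulas rather than on the cheminformatics toolkits the paper integrates. All names below are assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Tuple

# Average atomic masses (g/mol) for a few elements; enough for the toy tools.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def parse_formula(formula: str) -> Dict[str, int]:
    """Split a simple formula such as 'C2H6O' into element counts."""
    counts: Dict[str, int] = {}
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(count) if count else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Toy tool: molecular weight computed from element counts."""
    counts = parse_formula(formula)
    return round(sum(ATOMIC_MASS[s] * n for s, n in counts.items()), 3)


def heavy_atom_count(formula: str) -> int:
    """Toy tool: number of non-hydrogen atoms."""
    return sum(n for s, n in parse_formula(formula).items() if s != "H")


# Tool registry the agent can choose from (hypothetical, illustration only).
TOOLS: Dict[str, Callable[[str], object]] = {
    "molecular_weight": molecular_weight,
    "heavy_atom_count": heavy_atom_count,
}


def plan(question: str) -> Tuple[str, str]:
    """Planning: stand-in for the LLM; picks a tool by keyword."""
    wants_heavy = "heavy atom" in question.lower()
    tool_name = "heavy_atom_count" if wants_heavy else "molecular_weight"
    tool_input = question.rstrip("?").split()[-1]  # assume formula is the last word
    return tool_name, tool_input


def run_agent(question: str) -> str:
    tool_name, tool_input = plan(question)   # Planning
    tool = TOOLS[tool_name]                  # Action: select the tool
    result = tool(tool_input)                # Execution: call the tool
    return f"{tool_name}({tool_input}) = {result}"  # Observation


if __name__ == "__main__":
    print(run_agent("What is the molecular weight of C2H6O?"))
    print(run_agent("How many heavy atoms are in C2H6O?"))
```

In a real agent, the planner would be an LLM prompted with the tool descriptions, and the observation would be passed back to the model so it can either answer or plan another step.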