MT-Mol: Multi Agent System with Tool-based Reasoning for Molecular Optimization

Hyomin Kim, Yunhui Jang, Sungsoo Ahn

TL;DR

MT-Mol introduces a multi-agent LLM framework for molecular optimization that grounds design in tool-guided reasoning using RDKit. By decomposing tasks into analyst, scientist, verifier, and reviewer roles, and enforcing structured, stepwise reasoning with tool-informed feedback, MT-Mol achieves state-of-the-art performance on the PMO-1K benchmark across 17 of 23 tasks while maintaining chemical interpretability. The approach demonstrates the value of explicit collaboration and domain-specific tool integration in generating chemically valid, high-quality molecules under budget-constrained settings. This framework has practical implications for transparent AI-assisted molecular design and educational dissemination of chemical design reasoning, albeit with limitations related to tooling scope and language coverage.

Abstract

Large language models (LLMs) have considerable potential for molecular optimization, as they can call external chemistry tools and interact collaboratively to iteratively refine molecular candidates. However, this potential remains underexplored, particularly with respect to structured reasoning, interpretability, and comprehensive tool-grounded molecular optimization. To address this gap, we introduce MT-Mol, a multi-agent framework for molecular optimization that leverages tool-guided reasoning and role-specialized LLM agents. Our system incorporates a comprehensive set of RDKit tools, categorized into five distinct domains: structural descriptors, electronic and topological features, fragment-based functional groups, molecular representations, and miscellaneous chemical properties. Each category is managed by an expert analyst agent, which is responsible for extracting task-relevant tools and enabling interpretable, chemically grounded feedback. MT-Mol produces molecules with tool-aligned, stepwise reasoning through the interaction between the analyst agents, a molecule-generating scientist, a reasoning-output verifier, and a reviewer agent. As a result, our framework achieves state-of-the-art performance on 17 out of 23 tasks of the PMO-1K benchmark.

Paper Structure

This paper contains 39 sections, 4 figures, 11 tables.

Figures (4)

  • Figure 1: Overview of our method. Given a molecular optimization task, the analyst agents analyze the prompt and output a list of relevant RDKit functions from five categories. The top-$k$ molecules are retrieved as reference molecules for the scientist agent. The scientist agent then proposes a SMILES string with stepwise reasoning, which the verifier (double checker) validates for consistency. Finally, the reviewer assesses the reasoning using tool-informed descriptors and provides structured feedback. This generation and review process is repeated until the maximum number of iterations $N$ is reached. This multi-agent pipeline enables interpretable, tool-guided molecule generation with iterative refinement toward the design objective.
  • Figure 2: Example of analyst agents. Example case of five analyst agents analyzing the SMILES proposed by the scientist agent for the fexofenadine_mpo task. Each analyst agent chooses task-relevant tools from its category: electronic and topological descriptors, miscellaneous descriptors, identifiers and representations, structural descriptors, and functional groups. The molecules at the bottom visualize how the analyst agents analyze the scientist agent's proposed SMILES. Descriptions of the tools are provided in the appendix (tool list).
  • Figure 3: Examples of structured and stepwise responses. The figures illustrate the structured feedback mechanisms employed by our agent system for the mestranol_similarity task. (a) The verifier flags a mismatch between the reasoning and the SMILES, and the scientist revises both for consistency. (b) The reviewer suggests reducing the number of rotatable bonds, and the scientist incorporates this suggestion into the design, improving the score.
  • Figure 4: Top-10 AUC curves. Top-10 average AUC curves on the PMO benchmark, averaged over three random seeds. Our method consistently surpasses MOLLEO by achieving higher and faster-rising AUC curves, highlighting the effectiveness of tool-guided reasoning and multi-agent feedback in molecular optimization.
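
The generate-and-review loop described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: every name here (`Candidate`, `run_pipeline`, the toy agents, the `oracle` scoring function, the tool names) is hypothetical, the LLM agents are replaced by deterministic stand-in functions, and the point at which a candidate is scored is an assumption, since the caption does not specify it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    smiles: str
    reasoning: str
    score: float = 0.0


def run_pipeline(
    task_prompt: str,
    analysts: List[Callable[[str], List[str]]],
    scientist: Callable[[str, List[Candidate], str], Candidate],
    verifier: Callable[[Candidate], Tuple[bool, str]],
    reviewer: Callable[[Candidate, List[str]], str],
    oracle: Callable[[str], float],
    pool: List[Candidate],
    top_k: int = 2,
    max_iters: int = 3,
) -> Tuple[Candidate, List[Candidate]]:
    # Each analyst returns the tool names it considers relevant to the task.
    tools = [name for analyst in analysts for name in analyst(task_prompt)]
    history = list(pool)
    feedback = ""
    for _ in range(max_iters):
        # Retrieve the top-k scored molecules as references for the scientist.
        references = sorted(history, key=lambda c: c.score, reverse=True)[:top_k]
        candidate = scientist(task_prompt, references, feedback)
        # The verifier checks that the reasoning and the SMILES agree;
        # on a mismatch the scientist is asked once to revise (assumption).
        consistent, note = verifier(candidate)
        if not consistent:
            candidate = scientist(task_prompt, references, note)
        candidate.score = oracle(candidate.smiles)  # scoring step is assumed
        history.append(candidate)
        # The reviewer turns tool-informed descriptors into feedback text.
        feedback = reviewer(candidate, tools)
    return max(history, key=lambda c: c.score), history


# Toy stand-ins for the LLM agents (hypothetical, for illustration only).
def toy_analyst(prompt: str) -> List[str]:
    return ["structural_descriptor_tool"]


def toy_scientist(prompt: str, refs: List[Candidate], feedback: str) -> Candidate:
    # Extend the best reference by one carbon atom.
    return Candidate(refs[0].smiles + "C", "extend the chain by one carbon")


def toy_verifier(candidate: Candidate) -> Tuple[bool, str]:
    return (bool(candidate.reasoning), "reasoning is missing")


def toy_reviewer(candidate: Candidate, tools: List[str]) -> str:
    return "checked with " + ", ".join(tools)


def toy_oracle(smiles: str) -> float:
    # Toy objective: longer strings score higher, capped at 1.0.
    return min(len(smiles), 5) / 5


seed = Candidate("C", "seed molecule", toy_oracle("C"))
best, history = run_pipeline(
    "toy task", [toy_analyst], toy_scientist, toy_verifier,
    toy_reviewer, toy_oracle, [seed],
)
print(best.smiles, best.score)
```

In a real system each toy function would be replaced by an LLM call, and the reviewer would compute descriptors with the RDKit functions selected by the analysts rather than echoing tool names.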