DrugAgent: Automating AI-aided Drug Discovery Programming through LLM Multi-Agent Collaboration

Sizhe Liu, Yizhou Lu, Siyu Chen, Xiyang Hu, Jieyu Zhao, Yingzhou Lu, Yue Zhao

TL;DR

DrugAgent addresses the gap between theoretical AI ideas and robust drug-discovery implementations by deploying two specialized LLM agents that automate ML programming with domain knowledge. The Planner generates multiple solution ideas, while the Instructor translates them into code using curated domain documentation and tools, enabling end-to-end workflows. Across three representative tasks (ADMET, HTS, DTI), DrugAgent outperforms general-purpose baselines and approaches expert-written methods, notably achieving a 4.92% relative ROC-AUC improvement over ReAct on DTI, with ablations supporting the roles of both the Planner and the Instructor. This work demonstrates the practical potential of domain-aware, multi-agent AI systems to accelerate drug-discovery pipelines, while acknowledging scope and safety limitations and the need for future human-in-the-loop safeguards.

Abstract

Recent progress in Large Language Models (LLMs) has drawn attention to their potential for accelerating drug discovery. However, a central problem remains: translating theoretical ideas into robust implementations in the highly specialized context of pharmaceutical research. This limitation prevents practitioners from making full use of the latest AI developments in drug discovery. To address this challenge, we introduce DrugAgent, a multi-agent framework that automates machine learning (ML) programming for drug discovery tasks. DrugAgent employs an LLM Planner that formulates high-level ideas and an LLM Instructor that identifies and integrates domain knowledge when implementing those ideas. We present case studies on three representative drug discovery tasks. Our results show that DrugAgent consistently outperforms leading baselines, including a relative improvement of 4.92% in ROC-AUC compared to ReAct for drug-target interaction (DTI). DrugAgent is publicly available at https://anonymous.4open.science/r/drugagent-5C42/.
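
The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: no LLM is called, every name (`run_search`, `toy_planner`, `toy_instructor`, `relative_improvement`) is hypothetical, and the scores are placeholder numbers rather than results from the paper. The sketch assumes only what the abstract states: a Planner proposes several ideas, and an Instructor checks that the required domain knowledge is available before implementing each one. The last helper shows how a relative improvement in a metric such as ROC-AUC is computed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Result:
    idea: str
    score: Optional[float]  # validation metric, or None if the idea was not implemented


def run_search(
    propose_ideas: Callable[[str], List[str]],  # stand-in for the LLM Planner
    implement: Callable[[str, Dict[str, str]], Optional[float]],  # stand-in for the LLM Instructor
    task: str,
    domain_docs: Dict[str, str],
) -> List[Result]:
    """Try every idea proposed for the task and record the score of each one."""
    return [Result(idea, implement(idea, domain_docs)) for idea in propose_ideas(task)]


def best_result(results: List[Result]) -> Result:
    """Return the highest-scoring result among the ideas that were implemented."""
    scored = [r for r in results if r.score is not None]
    return max(scored, key=lambda r: r.score)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# --- Toy stand-ins (assumptions for illustration only) ---
def toy_planner(task: str) -> List[str]:
    # A real Planner would generate these ideas with an LLM.
    return ["random_forest_fingerprints", "graph_neural_network", "pretrained_language_model"]


PLACEHOLDER_SCORES = {"random_forest_fingerprints": 0.80, "graph_neural_network": 0.84}


def toy_instructor(idea: str, domain_docs: Dict[str, str]) -> Optional[float]:
    # Skip ideas for which no domain documentation is available.
    if idea not in domain_docs:
        return None
    # A real Instructor would write, run, and debug code; here we look up a placeholder.
    return PLACEHOLDER_SCORES.get(idea)


docs = {
    "random_forest_fingerprints": "how to featurize SMILES strings as fingerprints",
    "graph_neural_network": "how to convert molecules into graphs",
}
results = run_search(toy_planner, toy_instructor, "predict drug-target interaction", docs)
top = best_result(results)
gain = relative_improvement(top.score, 0.80)  # 0.80 is a made-up baseline score
print(top.idea, round(gain, 2))
```

Note that a relative improvement divides by the baseline score, so it differs from the absolute difference in ROC-AUC points; the abstract's 4.92% figure is stated as relative.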

Paper Structure

This paper contains 32 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Overview of the DrugAgent framework. Given a drug discovery task described in natural language (the user's input; e.g., "design an AI model to predict Absorption, one of the ADMET properties, using the PAMPA dataset" [Siramshetty2021]), the LLM Planner collaborates with the LLM Instructor to iteratively search for actionable, high-performing solutions.
  • Figure 2: Percentage of runs on the DAVIS (DTI) dataset that fall into each error mode.
  • Figure 3: Comparison of ReAct and DrugAgent on a DTI task. (a) ReAct, a general-purpose framework, delivers lower performance due to a lack of idea diversification and failure to recognize and incorporate domain knowledge. (b) DrugAgent systematically explores a variety of approaches, successfully identifying optimal models and preprocessing methods to achieve strong performance.