LIDDiA: Language-based Intelligent Drug Discovery Agent
Reza Averly, Frazier N. Baker, Ian A. Watson, Xia Ning
TL;DR
LIDDiA is a four-component, language-based agent for autonomous pre-clinical drug discovery that grounds LLM reasoning in structure-based generation and memory. By integrating a Reasoner, an Executor, an Evaluator, and a Memory, the framework can generate, optimize, and screen de novo molecules across multiple therapeutic targets, achieving a target success rate (TSR) of 73.3% on 30 targets and producing high-quality, novel candidates in most cases. The work reports strong performance against baselines, analyzes action patterns and trajectories to show how the agent balances exploration and exploitation, and provides case studies (e.g., AR/NR3C4 and EGFR) that highlight practical strengths and safety considerations. The authors emphasize that human-in-the-loop validation, broader property coverage, and eventual wet-lab verification are still needed to translate in silico success into real-world therapeutics.
Abstract
Drug discovery is a long, expensive, and complex process that relies heavily on human medicinal chemists, who can spend years searching the vast space of potential therapies. Recent advances in artificial intelligence for chemistry have sought to expedite individual drug discovery tasks; however, there remains a critical need for an intelligent agent that can navigate the drug discovery process as a whole. To this end, we introduce LIDDiA, an autonomous agent capable of intelligently navigating the drug discovery process in silico. By leveraging the reasoning capabilities of large language models, LIDDiA serves as a low-cost and highly adaptable tool for autonomous drug discovery. We comprehensively examine LIDDiA, demonstrating that (1) it can generate molecules meeting key pharmaceutical criteria on over 70% of 30 clinically relevant targets, (2) it intelligently balances exploration and exploitation in the chemical space, and (3) it identifies one promising novel candidate on AR/NR3C4, a critical target for both prostate and breast cancers. Code and dataset are available at https://github.com/ninglab/LIDDiA
