Beyond SMILES: Evaluating Agentic Systems for Drug Discovery
Edward Wijaya
TL;DR
This work asks whether agentic drug-discovery systems generalize beyond small-molecule workflows by evaluating six frameworks across 15 practitioner-derived task classes. It identifies five capability gaps: small-molecule representation bias, no integration of in vivo and in silico data, a narrow range of computational paradigms (LLM inference with no pathway to ML training or reinforcement learning), misalignment with the constraints of small biotechs, and single-objective optimization. A knowledge-probing study suggests that frontier LLMs can reason about peptides but that current agents fail to surface this capability because of architectural limitations, which points to a need for integration pipelines rather than model retraining. The authors derive five design requirements for next-generation frameworks that act as computational partners: multi-paradigm orchestration, modality-aware representations, in vivo data fusion, data-efficient learning, and risk-aware multi-objective optimization. These findings offer a roadmap for building agentic systems that operate under realistic constraints and support iterative design-test cycles across diverse drug discovery contexts.
Abstract
Agentic systems for drug discovery have demonstrated autonomous synthesis planning, literature mining, and molecular design. We ask how well they generalize. Evaluating six frameworks against 15 task classes drawn from peptide therapeutics, in vivo pharmacology, and resource-constrained settings, we find five capability gaps: no support for protein language models or peptide-specific prediction, no bridges between in vivo and in silico data, reliance on LLM inference with no pathway to ML training or reinforcement learning, assumptions tied to large-pharma resources, and single-objective optimization that ignores safety-efficacy-stability trade-offs. A paired knowledge-probing experiment suggests the bottleneck is architectural rather than epistemic: four frontier LLMs reason about peptides at levels comparable to small molecules, yet no framework exposes this capability. We propose design requirements and a capability matrix for next-generation frameworks that function as computational partners under realistic constraints.
