Democratizing AI scientists using ToolUniverse
Shanghua Gao, Richard Zhu, Pengwei Sui, Zhenglun Kong, Sufian Aldogom, Yepeng Huang, Ayush Noori, Reza Shamji, Krishna Parvataneni, Theodoros Tsiligkaridis, Marinka Zitnik
TL;DR
ToolUniverse is a scalable ecosystem for building AI scientists: it brings a large collection of scientific tools under a unified AI–tool interaction protocol. It provides components for discovering, validating, composing, and optimizing tools, so that AI models can plan, execute, and refine multi-step experiments across domains without retraining. A hypercholesterolemia case study demonstrates the framework by interleaving literature mining, target validation, and molecular screening to propose candidate therapeutics under human oversight. Together, these contributions offer reusable, governance-aware infrastructure that lowers the barrier to building AI-driven scientific discovery systems and to evaluating their integration with laboratory workflows.
Abstract
AI scientists are emerging computational systems that serve as collaborative partners in discovery. These systems remain difficult to build because they are bespoke, tied to rigid workflows, and lack shared environments that unify tools, data, and analyses into a common ecosystem. In genomics, unified ecosystems have transformed research by enabling interoperability, reuse, and community-driven development; AI scientists require comparable infrastructure. We present ToolUniverse, an ecosystem for building AI scientists from any language or reasoning model, whether open- or closed-weight. ToolUniverse standardizes how AI scientists identify and call tools, and it provides more than 600 machine learning models, datasets, APIs, and scientific packages for data analysis, knowledge retrieval, and experimental design. It automatically refines tool interfaces for correct use by AI scientists, generates new tools from natural language descriptions, iteratively optimizes tool specifications, and composes tools into agentic workflows. In a case study of hypercholesterolemia, ToolUniverse was used to create an AI scientist that identified a potent drug analog with favorable predicted properties. The open-source ToolUniverse is available at https://aiscientist.tools.
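
To make the idea of a standardized "identify and call" interface concrete, here is a minimal illustrative sketch. It is not ToolUniverse's actual API: `ToolSpec`, `ToolRegistry`, `find`, `call`, and the two toy sequence tools are hypothetical names chosen for this example. The sketch assumes only that each tool exposes a name, a description, and typed parameters, so that a model can search tools by keyword and invoke any of them through a single call path.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolSpec:
    """Hypothetical uniform description of one tool (an assumption of this sketch)."""
    name: str
    description: str
    parameters: Dict[str, type]   # argument name -> expected Python type
    run: Callable[..., Any]       # underlying implementation


class ToolRegistry:
    """Hypothetical registry offering one way to find tools and one way to call them."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        self._tools[spec.name] = spec

    def find(self, query: str) -> List[str]:
        """Return sorted names of tools whose name or description contains every query word."""
        words = query.lower().split()
        return sorted(
            name
            for name, spec in self._tools.items()
            if all(w in f"{spec.name} {spec.description}".lower() for w in words)
        )

    def call(self, name: str, arguments: Dict[str, Any]) -> Any:
        """Validate arguments against the tool's declared parameters, then run it."""
        spec = self._tools[name]
        if set(arguments) != set(spec.parameters):
            raise ValueError(f"{name} expects arguments {sorted(spec.parameters)}")
        for key, expected in spec.parameters.items():
            if not isinstance(arguments[key], expected):
                raise TypeError(f"{name}: argument '{key}' must be {expected.__name__}")
        return spec.run(**arguments)


def gc_fraction(sequence: str) -> float:
    """Toy tool: fraction of G and C bases in a DNA sequence."""
    return sum(1 for base in sequence.upper() if base in "GC") / len(sequence)


def reverse_complement(sequence: str) -> str:
    """Toy tool: reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(sequence.upper()))


registry = ToolRegistry()
registry.register(ToolSpec(
    name="gc_fraction",
    description="Fraction of G and C bases in a DNA sequence",
    parameters={"sequence": str},
    run=gc_fraction,
))
registry.register(ToolSpec(
    name="reverse_complement",
    description="Reverse complement of a DNA sequence",
    parameters={"sequence": str},
    run=reverse_complement,
))

# A model would first search for a tool, then call it with structured arguments.
matches = registry.find("dna gc")
result = registry.call(matches[0], {"sequence": "GGCA"})
```

Because every tool is reached through the same `find` and `call` path, the calling model in this sketch needs no tool-specific code. That is the property the abstract attributes to a standardized interface, although the real system may achieve it differently.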
