TinyScientist: An Interactive, Extensible, and Controllable Framework for Building Research Agents

Haofei Yu, Keyang Xuan, Fenghai Li, Kunlun Zhu, Zijie Lei, Jiaxun Zhang, Ziheng Qi, Kyle Richardson, Jiaxuan You

TL;DR

Automating scientific research with LLMs promises productivity gains but introduces complex, multi-stage workflows. TinyScientist offers an interactive, extensible, and controllable framework built around four core stages (Thinker, Coder, Writer, Reviewer), three feature components (Formatter, MCPClient, Checker), and MCP-based tool integration. It ships as a Python package with a web UI, supports multiple I/O formats, and demonstrates competitive generation quality with safety and budget controls. Evaluations show improved writing and idea quality, with tool usage further improving performance and human oversight supporting responsible use, which suggests broad applicability for researchers and developers.

Abstract

Automatic research with Large Language Models (LLMs) is rapidly gaining importance, driving the development of increasingly complex workflows involving multi-agent systems, planning, tool usage, code execution, and human-agent interaction to accelerate research processes. However, as more researchers and developers begin to use and build upon these tools and platforms, the complexity and difficulty of extending and maintaining such agentic workflows have become a significant challenge, particularly as algorithms and architectures continue to advance. To address this growing complexity, TinyScientist identifies the essential components of the automatic research workflow and proposes an interactive, extensible, and controllable framework that easily adapts to new tools and supports iterative growth. We provide an open-source codebase, an interactive web demonstration, and a PyPI Python package to make state-of-the-art auto-research pipelines broadly accessible to every researcher and developer.
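
As a rough illustration of the staged, budget-controlled workflow described above, the sketch below chains four placeholder stages named after the TL;DR (Thinker, Coder, Writer, Reviewer) under a simple spending cap. It is not the TinyScientist package API: `BudgetController`, `BudgetExceeded`, `run_pipeline`, the stage functions, and the per-stage costs are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when a stage would push spending past the configured limit."""


@dataclass
class BudgetController:
    """Hypothetical budget guard; not the real TinyScientist controller."""
    limit: float
    spent: float = 0.0

    def charge(self, stage: str, cost: float) -> None:
        # Refuse to run a stage that does not fit in the remaining budget.
        if self.spent + cost > self.limit:
            raise BudgetExceeded(f"{stage} costs {cost}, budget limit is {self.limit}")
        self.spent += cost


State = Dict[str, str]
Stage = Callable[[State], State]


# Placeholder stages: each reads the shared state and adds one artifact.
def thinker(state: State) -> State:
    return {**state, "idea": f"Idea for: {state['intent']}"}


def coder(state: State) -> State:
    return {**state, "code": f"# experiment for {state['idea']}"}


def writer(state: State) -> State:
    return {**state, "paper": f"Draft based on {state['idea']}"}


def reviewer(state: State) -> State:
    return {**state, "review": f"Review of {state['paper']}"}


# (stage name, stage function, assumed cost in arbitrary units)
PIPELINE: List[Tuple[str, Stage, float]] = [
    ("Thinker", thinker, 1.0),
    ("Coder", coder, 2.0),
    ("Writer", writer, 1.5),
    ("Reviewer", reviewer, 0.5),
]


def run_pipeline(intent: str, budget: BudgetController) -> State:
    """Run the four stages in order, charging the budget before each one."""
    state: State = {"intent": intent}
    for name, stage, cost in PIPELINE:
        budget.charge(name, cost)
        state = stage(state)
    return state


if __name__ == "__main__":
    budget = BudgetController(limit=10.0)
    result = run_pipeline("efficient fine-tuning", budget)
    print(sorted(result))
    print(budget.spent)
```

Keeping each stage a plain function over a shared state dictionary means a stage can be replaced or a new one added by editing the `PIPELINE` list, which is the kind of extensibility the abstract emphasizes; a real system would call LLMs and tools inside each stage.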

Paper Structure

This paper contains 20 sections, 4 equations, 12 figures, 24 tables, 1 algorithm.

Figures (12)

  • Figure 1: Design principles for TinyScientist. We highlight the key differences between traditional research agents and TinyScientist. To enhance interactivity, TinyScientist introduces a table-based interface that helps researchers clearly express and refine their intents. For extensibility, TinyScientist adopts an MCP (Model Context Protocol) design instead of direct API wrapping, making it easy to add or replace tools. For controllability, TinyScientist includes built-in safety and budget controllers that monitor and regulate the entire workflow.
  • Figure 2: Overview of the TinyScientist framework. The left side of the diagram illustrates the class hierarchy of TinyScientist. At the top left, TinyScientist serves as the base class. It manages four workflow components, each responsible for a core stage of the research process. In turn, each workflow component is supported by three feature components that extend it beyond its core function. The right side describes the overall workflow and the details of each feature component separately.
  • Figure 3: Example of iterative interaction within the thinking stage. The upper box shows a research idea (including contents, scores, tables, and experimental plans) generated by the Thinker. The lower box lets users provide custom instructions to refine the idea.
  • Figure 4: Example of table-based interaction between stages. This shows one novelty comparison result from the idea thinking stage, with the generated idea organized as a single row in a table.
  • Figure 5: LLM-based evaluation results. We report quality scores on a 5-point scale assigned by an LLM judge (GPT-4o) for the generated paper outputs. The evaluation covers both the biological and ML domains, and each includes (1) writing quality and (2) idea quality.
  • ...and 7 more figures