AIGS: Generating Science from AI-Powered Automated Falsification

Zijun Liu; Kaiming Liu; Yiqi Zhu; Xuanyu Lei; Zonghan Yang; Zhenhe Zhang; Peng Li; Yang Liu

TL;DR

This work defines AI-Generated Science (AIGS) and places falsification at the core of scientific discovery, proposing Baby-AIGS as a practical baby step toward fully autonomous, end-to-end AIGS. It introduces a three-agent stack (ProposalAgent, ReviewAgent, FalsificationAgent) and a Domain-Specific Language (DSL) that translates ideas into executable experiments, paired with a multi-sampling strategy to enhance exploration. Across three ML-focused experiments (data engineering, self-instruct alignment, language modeling), Baby-AIGS demonstrates autonomous discovery and iterative creativity, while acknowledging that its current performance lags behind expert researchers and that robust falsification remains challenging. The paper also discusses limitations, actionable insights, ethical considerations, and a roadmap for extending AIGS to broader scientific domains and responsible deployment.

Abstract

The rapid development of artificial intelligence has drastically accelerated scientific discovery. Trained with large-scale observation data, deep neural networks extract the underlying patterns in an end-to-end manner and assist human researchers with highly precise predictions in unseen scenarios. The recent rise of Large Language Models (LLMs) and the autonomous agents they empower enables scientists to gain help through interaction at different stages of their research, including but not limited to literature review, research ideation, idea implementation, and academic writing. However, AI researchers instantiated by foundation-model-empowered agents with full-process autonomy are still in their infancy. In this paper, we study $\textbf{AI-Generated Science}$ (AIGS), where agents independently and autonomously complete the entire research process and discover scientific laws. By revisiting the definition of scientific research, we argue that $\textit{falsification}$ is the essence of both the human research process and the design of an AIGS system. Through the lens of falsification, prior systems aiming at AI-Generated Science either lack falsification in their design or rely heavily on existing verification engines that restrict their use to specialized domains. In this work, we propose Baby-AIGS as a baby-step demonstration of a full-process AIGS system: a multi-agent system whose agents take on roles representing key stages of the research process. By introducing FalsificationAgent, which identifies and then verifies possible scientific discoveries, we empower the system with explicit falsification. Experiments on three tasks preliminarily show that Baby-AIGS can produce meaningful scientific discoveries, though not on par with experienced human researchers. Finally, we discuss the limitations of the current Baby-AIGS, actionable insights, and related ethical issues in detail.
