Principle-Evolvable Scientific Discovery via Uncertainty Minimization
Yingming Pu, Tao Lin, Hongyu Chen
TL;DR
PiEvo reframes automated scientific discovery as Bayesian optimization over an expanding principle space $\bar{\mathcal{P}}$, so that scientific principles can evolve in response to anomalies. By coupling Information-Directed Sampling with Gaussian Process-based likelihoods, PiEvo selects informative hypotheses and coherently augments the principle space when high-surprisal evidence arises, achieving sublinear regret $\tilde{\mathcal{O}}(\sqrt{T})$ and posterior consistency. Across four diverse benchmarks, PiEvo delivers state-of-the-art solution quality (SQ of roughly 90.8% to 93.2%), an approximately 83% speedup in convergence, and robust performance across domains and LLM backbones, including de novo discovery of physical mechanisms in nanohelix chirality. This principle-evolution framework shifts the bottleneck from searching a static hypothesis space to dynamically refining the underlying principles, enabling scalable and transferable scientific discovery.
Abstract
Large Language Model (LLM)-based scientific agents have accelerated scientific discovery, yet they often suffer from significant inefficiencies because they adhere to fixed initial priors. Existing approaches predominantly operate within a static hypothesis space, which restricts the discovery of novel phenomena and wastes computation when baseline theories fail. To address this, we propose shifting the focus from searching hypotheses to evolving the underlying scientific principles. We present PiEvo, a principle-evolvable framework that treats scientific discovery as Bayesian optimization over an expanding principle space. By integrating Information-Directed Hypothesis Selection via Gaussian Process and an anomaly-driven augmentation mechanism, PiEvo enables agents to autonomously refine their theoretical worldview. Evaluation across four benchmarks demonstrates that PiEvo (1) achieves an average solution quality of 90.81% to 93.15%, a 29.7% to 31.1% improvement over the state of the art, (2) attains an 83.3% speedup in convergence steps, owing to the reduced sample complexity of optimizing over the compact principle space, and (3) maintains robust performance across diverse scientific domains and LLM backbones.
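To make the idea of anomaly-driven augmentation concrete, the following is a minimal toy sketch, not PiEvo's actual implementation. It assumes each principle is reduced to a single predicted mean with a Gaussian likelihood (standing in for the Gaussian Process likelihoods), and that a new principle is simply centred on the surprising observation (standing in for whatever proposal mechanism the framework uses). The function names, the surprisal `threshold`, and the prior mass `new_mass` given to a new principle are all hypothetical choices made for illustration.

```python
import math


def gaussian_loglik(y, mean, sigma=1.0):
    """Log-density of observation y under N(mean, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((y - mean) / sigma) ** 2


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def observe(log_w, means, y, sigma=1.0, threshold=6.0, new_mass=0.5):
    """Bayesian update over a principle set that may expand.

    log_w : log posterior weights, one per principle (assumed normalised)
    means : predicted outcome of each principle (toy stand-in for a model)
    Returns (new_log_w, new_means, surprisal, expanded).
    """
    # Joint log-probability of each principle and the observation.
    joint = [lw + gaussian_loglik(y, m, sigma) for lw, m in zip(log_w, means)]
    # Surprisal = -log p(y) under the current mixture of principles.
    surprisal = -logsumexp(joint)

    expanded = surprisal > threshold
    if expanded:
        # Hypothetical augmentation rule: add a principle that predicts y,
        # give it prior mass `new_mass`, and rescale the existing weights.
        means = means + [y]
        log_w = [lw + math.log(1.0 - new_mass) for lw in log_w] + [math.log(new_mass)]
        joint = [lw + gaussian_loglik(y, m, sigma) for lw, m in zip(log_w, means)]

    # Normalise to obtain the posterior over the (possibly larger) set.
    norm = logsumexp(joint)
    return [j - norm for j in joint], means, surprisal, expanded


if __name__ == "__main__":
    log_w, means = [math.log(0.5)] * 2, [0.0, 1.0]
    for y in (0.2, 8.0):  # the second observation is anomalous
        log_w, means, surprisal, expanded = observe(log_w, means, y)
        print(f"y={y}: surprisal={surprisal:.2f}, expanded={expanded}, "
              f"weights={[round(math.exp(v), 3) for v in log_w]}")
```

In this toy setting, an observation that the existing principles explain reasonably well only shifts posterior mass among them, while an observation with high surprisal triggers the addition of a new principle that then absorbs most of the posterior. Working in log space avoids underflow when an observation lies far from every current prediction.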
