Principle-Evolvable Scientific Discovery via Uncertainty Minimization

Yingming Pu, Tao Lin, Hongyu Chen

TL;DR

PiEvo reframes automated scientific discovery as Bayesian optimization over an expanding principle space $\bar{\mathcal{P}}$, enabling anomaly-driven evolution of scientific principles. By coupling Information-Directed Sampling with Gaussian Process-based likelihoods, PiEvo selects informative hypotheses and coherently augments principles when high-surprisal evidence arises, achieving sublinear regret $\tilde{\mathcal{O}}(\sqrt{T})$ and posterior consistency. Across four diverse benchmarks, PiEvo delivers state-of-the-art solution quality ($SQ$ near $90.8$–$93.2\%$), ~83% faster convergence, and robust performance across domains and LLM backbones, including de novo discovery of physical mechanisms in nanohelix chirality. This principle-evolution framework shifts the bottleneck from static hypothesis spaces to dynamic, principled worldview refinement, enabling scalable and transferable scientific discovery.

Abstract

Large Language Model (LLM)-based scientific agents have accelerated scientific discovery, yet they often suffer from significant inefficiencies because they adhere to fixed initial priors. Existing approaches predominantly operate within a static hypothesis space, which restricts the discovery of novel phenomena and wastes computation when baseline theories fail. To address this, we propose shifting the focus from searching hypotheses to evolving the underlying scientific principles. We present PiEvo, a principle-evolvable framework that treats scientific discovery as Bayesian optimization over an expanding principle space. By integrating Gaussian Process-based Information-Directed Hypothesis Selection with an anomaly-driven augmentation mechanism, PiEvo enables agents to autonomously refine their theoretical worldview. Evaluation across four benchmarks demonstrates that PiEvo (1) achieves an average solution quality in the range 90.81%–93.15%, a 29.7%–31.1% improvement over the state of the art, (2) attains an 83.3% speedup in convergence steps by optimizing over the compact principle space, which significantly reduces sample complexity, and (3) maintains robust performance across diverse scientific domains and LLM backbones.
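
To make the "Bayesian optimization over a principle space" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a finite set of candidate principles, each giving a point prediction for an experiment, and a fixed-noise Gaussian likelihood standing in for the Gaussian Process likelihood. The function names (`gaussian_loglik`, `update_posterior`, `surprisal`) and the dictionary layout are hypothetical.

```python
import math


def gaussian_loglik(y, mu, sigma):
    """Log density of observation y under N(mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2)


def update_posterior(prior, predictions, y, sigma):
    """Bayes update over a finite principle set (assumed layout).

    prior:       dict principle -> probability (all strictly positive)
    predictions: dict principle -> predicted outcome of the experiment
    y:           observed outcome
    """
    log_post = {
        p: math.log(prior[p]) + gaussian_loglik(y, predictions[p], sigma)
        for p in prior
    }
    # Subtract the max before exponentiating for numerical stability.
    m = max(log_post.values())
    unnorm = {p: math.exp(v - m) for p, v in log_post.items()}
    z = sum(unnorm.values())
    return {p: u / z for p, u in unnorm.items()}


def surprisal(prior, predictions, y, sigma):
    """Negative log marginal likelihood of y under the current principle mixture."""
    marginal = sum(
        prior[p] * math.exp(gaussian_loglik(y, predictions[p], sigma))
        for p in prior
    )
    return -math.log(marginal)


prior = {"P1": 0.5, "P2": 0.5}
predictions = {"P1": 0.0, "P2": 1.0}
posterior = update_posterior(prior, predictions, y=1.0, sigma=0.5)
```

In this toy setup, an observation that matches `P2` shifts posterior mass toward `P2`, while an observation far from every prediction yields a large surprisal. Under the paper's framing, evidence of that kind would count as an anomaly that no active principle explains.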

Paper Structure (62 sections, 12 theorems, 54 equations, 16 figures, 9 tables, 1 algorithm)


Key Result

Theorem 3.3

Under coherent principle augmentation and a calibrated generator, the PiEvo optimization loop guarantees (formally stated in Theorem thm:main): (a) sublinear cumulative regret, $R(T) = \tilde{\mathcal{O}}(\sqrt{T})$, and (b) posterior consistency, where the system

Figures (16)

  • Figure 1: Illustration of principle evolution. PiEvo evolves the optimizable principle space by progressively exploring hypothesis candidates, and the system seeks a principle capable of explaining or predicting the observed phenomena (e.g., empirical observations).
  • Figure 2: Evolutionary trajectory of the principle space via Coherent Augmentation. The system guided by PiEvo operates within a restricted set of Active Principles; however, when high-surprisal anomalies suggest epistemic stagnation, the system triggers a discovery phase. PiEvo integrates new candidate principles ($P_{new}$) to reconcile these anomalies while maintaining consistency with all historical evidence through Coherent Augmentation. This iterative evolution drives the optimization trajectory toward the True Principle $P^\star$, enabling the identification of the optimal hypothesis candidate.
  • Figure 3: Guidance Injection Mechanism. PiEvo generates strategic guidance from backend optimization using a prompt template (see Appendix \ref{sec:agent_prompts}). The guidance covers (a) Active Principles, (b) identified high-surprisal anomalies, and (c) hypothesis selection, and is injected into the agent's context to guide reasoning.
  • Figure 4: Trajectory comparison among baselines with Qwen3-32B. We plot the cumulative best solution quality over time for each task. SQ $> 100\%$ indicates surpassing the empirical reference. PiEvo consistently outperforms baselines across all tasks, though in challenging tasks like TMC and SPO the gap is smaller due to the inherent search complexity of hypotheses.
  • Figure 5: Parameter sensitivity ablations. Both the anomaly count threshold and the anomaly threshold affect the number of principles, while sigma, which represents noise, affects the system's exploration capability. The number of warm-up rounds affects the system's initialization.
  • ...and 11 more figures
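
The anomaly-triggered discovery phase described in Figure 2, together with the parameters named in Figure 5, can be illustrated with a small sketch. This is a hedged illustration under stated assumptions, not the paper's procedure. It assumes that an observation counts as an anomaly when its surprisal exceeds a fixed threshold, that augmentation fires once enough anomalies accumulate after warm-up, and that a new principle receives a fixed share of probability mass while existing principles keep their relative weights. The names `should_augment`, `augment`, and `new_mass` are hypothetical.

```python
def should_augment(surprisals, anomaly_threshold, anomaly_count_threshold, warmup_rounds):
    """Return True when enough high-surprisal observations occur after warm-up.

    surprisals: per-round surprisal values, oldest first (assumed layout).
    """
    post_warmup = surprisals[warmup_rounds:]
    n_anomalies = sum(1 for s in post_warmup if s > anomaly_threshold)
    return n_anomalies >= anomaly_count_threshold


def augment(posterior, new_principle, new_mass):
    """Add a principle with prior mass new_mass, rescaling the rest.

    The rescaling rule is an assumption made for illustration; it keeps the
    relative weights of existing principles unchanged.
    """
    if new_principle in posterior:
        raise ValueError("principle already active")
    if not 0.0 < new_mass < 1.0:
        raise ValueError("new_mass must lie strictly between 0 and 1")
    scaled = {p: w * (1.0 - new_mass) for p, w in posterior.items()}
    scaled[new_principle] = new_mass
    return scaled


history = [9.0, 9.0, 0.5, 6.0, 7.0]  # first two rounds are warm-up
if should_augment(history, anomaly_threshold=5.0, anomaly_count_threshold=2, warmup_rounds=2):
    active = augment({"A": 0.75, "B": 0.25}, "C", new_mass=0.2)
```

In this sketch, raising either threshold makes augmentation rarer and so keeps the set of active principles smaller, which is one plausible reading of the sensitivity reported in Figure 5.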

Theorems & Definitions (24)

  • Definition 3.1: Universal Principle Space
  • Example 3.2: High-Reward Trap (Paradigm Exploitation) in SPO
  • Theorem 3.3: Convergence and Efficiency, Informal
  • Lemma 1.1: Regret decomposition
  • proof
  • Lemma 1.2: Discrete-entropy bound for PH-regret
  • proof
  • Lemma 1.3: IDS information-ratio bound
  • proof
  • Lemma 1.4: Information accounting under coherent augmentation
  • ...and 14 more