PiFlow: Principle-Aware Scientific Discovery with Multi-Agent Collaboration
Yingming Pu, Tao Lin, Hongyu Chen
TL;DR
PiFlow addresses inefficiencies in automated scientific discovery with an information-theoretic framework that treats discovery as structured uncertainty reduction guided by scientific principles. Its core method, a Min-Max optimization, balances exploiting high-potential principles against gaining information that reduces epistemic uncertainty, using practical proxies such as embedding distance for scalability. The approach achieves sublinear regret, $O(\sqrt{T})$, and experiments on nanohelix, molecular bio-activity, and superconductor tasks show improved discovery efficiency and solution quality, with a 5.6x speedup in time-to-solution and up to 27% fewer tokens than vanilla agents. Its plug-and-play design integrates with existing multi-agent systems, offering a scalable path toward robust AI-driven scientific exploration across domains.
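To make the exploitation versus information-gain trade-off concrete, here is a minimal sketch. It is not PiFlow's actual Min-Max formulation, which this summary does not spell out. It substitutes a simple weighted-sum rule, and the names `select_principle`, `cosine_distance`, and `weight`, the toy embeddings, and the estimated values are all hypothetical. The novelty term uses the embedding distance to the nearest already-tested principle as a rough stand-in for information gain.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def select_principle(candidates, tested, weight=0.5):
    """Pick the candidate with the best blend of value and novelty.

    candidates: list of (name, estimated_value, embedding) tuples.
    tested: embeddings of principles that were already explored.
    weight: hypothetical knob; 0 = pure exploitation, 1 = pure exploration.
    """
    def score(candidate):
        _, value, embedding = candidate
        # Assumed proxy for information gain: distance to the nearest
        # tested principle (defaults to 1.0 when nothing was tested yet).
        novelty = min(
            (cosine_distance(embedding, t) for t in tested), default=1.0
        )
        return (1.0 - weight) * value + weight * novelty

    return max(candidates, key=score)[0]

# Toy data: principle "A" looks more promising but was already explored.
candidates = [("A", 0.9, [1.0, 0.0]), ("B", 0.6, [0.0, 1.0])]
tested = [[1.0, 0.0]]

exploit_choice = select_principle(candidates, tested, weight=0.0)
balanced_choice = select_principle(candidates, tested, weight=0.5)
```

With `weight=0.0` the rule ignores novelty and keeps the high-value principle. With `weight=0.5` the unexplored principle wins because its embedding lies far from everything tested so far.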
Abstract
Large Language Model (LLM)-based multi-agent systems (MAS) demonstrate remarkable potential for scientific discovery. Existing approaches, however, typically automate discovery through predefined workflows that lack rationality constraints. This leads to aimless hypothesizing and a failure to consistently link hypotheses with evidence, which hinders the systematic reduction of uncertainty. Overcoming these limitations requires a principled approach to exploration. We introduce PiFlow, an information-theoretic framework that treats automated scientific discovery as a structured uncertainty reduction problem guided by principles (e.g., scientific laws). Extensive evaluations across three distinct scientific domains demonstrate that PiFlow (I) improves discovery efficiency by 31.18% to 41.73% and solution quality by 12.47% to 31.72% against state-of-the-art methods, (II) delivers a 5.6x speedup in time-to-solution while reducing token consumption by up to 27% compared to vanilla agents, and (III) serves as a Plug-and-Play module that generalizes to existing agent architectures. Overall, PiFlow establishes a new paradigm for highly efficient agentic scientific discovery, paving the way for more robust and accelerated AI-driven research.
