Tiny Moves: Game-based Hypothesis Refinement
Agnieszka Dobrowolska, Rogier Hintzen, Martin Balla, Karl Gemayel, Sabine Reichert, Thomas Charman, Jen Ning Lim, Lindsay Edwards, Anna Gogleva
TL;DR
The paper introduces The Hypothesis Game, a symbolic, move-based framework for hypothesis refinement that makes scientific reasoning explicit through a shared hypothesis state and a fixed grammar of moves. A minimal game, implemented with a central LLM controller (the Game Master), is evaluated on Reactome-derived pathway tasks. In corruption recovery it removes more errors and achieves higher precision than strong prompting baselines, and in reconstruction from partial cues it performs comparably to the strongest baseline. The results suggest that incremental, interpretable edits can make AI-driven scientific discovery more controllable and transferable, and the paper outlines avenues for richer representations and learned controllers. Overall, game-based reasoning offers a principled route to more interpretable, reusable, and robust hypothesis refinement systems.
Abstract
Most machine learning approaches to scientific discovery frame hypotheses as end-to-end predictions, obscuring the incremental structure of scientific reasoning. We propose The Hypothesis Game, a symbolic formalism for hypothesis refinement in which LLM agents operate on a shared hypothesis state using a fixed grammar of reasoning moves. The framework is motivated by the observation that scientific progress often proceeds through small, localized revisions, grounded in domain context, rather than extensive rewrites. We instantiate a minimal game with LLM agents and evaluate it on pathway-level mechanistic refinement tasks. In the primary setting of corruption recovery, where hypotheses contain controlled errors, the game-based approach consistently removes more errors and achieves higher precision than strong prompting baselines, while preserving valid structure through incremental edits. In a secondary reconstruction setting from partial cues, it performs comparably to the strongest baseline, indicating that explicit move-based refinement remains competitive even when ground-truth recovery is difficult. These findings support game-based reasoning as a principled route to more controllable, interpretable, and transferable hypothesis refinement systems for scientific discovery.
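To make the idea of a shared hypothesis state edited through a fixed grammar of moves concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the move names (ADD, REMOVE, REPLACE), the representation of a hypothesis as a set of (source, relation, target) triples, the toy pathway, and the two metric helpers are all hypothetical. The LLM controller is replaced by a hard-coded move list so that the example runs on its own.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional, Tuple

# Assumption: a hypothesis is a set of (source, relation, target) triples.
Edge = Tuple[str, str, str]
State = FrozenSet[Edge]


class MoveType(Enum):
    # Hypothetical move grammar; the abstract does not enumerate the real moves.
    ADD = "add"
    REMOVE = "remove"
    REPLACE = "replace"


@dataclass(frozen=True)
class Move:
    kind: MoveType
    edge: Edge
    new_edge: Optional[Edge] = None  # only used by REPLACE


def apply_move(state: State, move: Move) -> State:
    """Return a new state with one small, localized edit applied."""
    if move.kind is MoveType.ADD:
        return state | {move.edge}
    if move.kind is MoveType.REMOVE:
        return state - {move.edge}
    if move.kind is MoveType.REPLACE:
        if move.new_edge is None:
            raise ValueError("REPLACE requires new_edge")
        if move.edge not in state:
            return state  # nothing to replace; leave the state unchanged
        return (state - {move.edge}) | {move.new_edge}
    raise ValueError(f"unknown move type: {move.kind}")


def precision(state: State, truth: State) -> float:
    """Fraction of edges in the state that also appear in the reference."""
    return len(state & truth) / len(state) if state else 0.0


def errors_removed(initial: State, final: State, truth: State) -> int:
    """Number of initially wrong edges that are absent from the final state."""
    return len((initial - truth) - final)


# Toy corruption-recovery episode (all entity names are made up).
truth: State = frozenset({("A", "activates", "B"), ("B", "phosphorylates", "C")})
corrupted: State = truth | {("A", "inhibits", "C")}

# Stand-in for the controller's output: a single removal of the injected error.
moves = [Move(MoveType.REMOVE, ("A", "inhibits", "C"))]

state = corrupted
for m in moves:
    state = apply_move(state, m)
```

Because every move returns a new immutable state, the sequence of intermediate states forms an inspectable trace of the refinement, and valid edges are left untouched unless a move names them explicitly.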
