Solomonoff-Inspired Hypothesis Ranking with LLMs for Prediction Under Uncertainty
Josh Barber, Rourke Young, Cameron Coombe, Will Browne
TL;DR
The paper tackles uncertainty in abstract reasoning by ranking and combining multiple LLM-generated hypotheses with a Solomonoff-inspired scoring scheme that jointly favours simplicity and data fit. It builds a finite, computable hypothesis pool, creates a per-cell weighted prediction matrix, and compares against Bayesian Model Averaging on Mini-ARC tasks. Key findings show better uncertainty calibration for the Solomonoff approach in the presence of noisy hypotheses, while BMA can yield sharper predictions when hypotheses are reliable. The work demonstrates the value of algorithmic information-theoretic priors for interpretable, robust multi-hypothesis reasoning under data sparsity, with potential extensions to robotics and larger ARC benchmarks.
Abstract
Reasoning under uncertainty is a key challenge in AI, especially for real-world tasks, where problems with sparse data demand systematic generalisation. Existing approaches struggle to balance accuracy and simplicity when evaluating multiple candidate solutions. We propose a Solomonoff-inspired method that weights LLM-generated hypotheses by simplicity and predictive fit. Applied to Mini-ARC benchmark tasks, our method produces Solomonoff-weighted mixtures for per-cell predictions, yielding conservative, uncertainty-aware outputs even when hypotheses are noisy or partially incorrect. Compared to Bayesian Model Averaging (BMA), Solomonoff scoring spreads probability more evenly across competing hypotheses, while BMA concentrates weight on the most likely but potentially flawed candidates. Across tasks, these results highlight the value of algorithmic information-theoretic priors for interpretable, reliable multi-hypothesis reasoning under uncertainty.
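As a rough illustration of weighting hypotheses by simplicity and predictive fit and then mixing their per-cell predictions, the sketch below may help. The abstract does not state the exact scoring formula, so the form used here (a 2^-length simplicity term multiplied by a likelihood term, then normalised) is an assumption borrowed from the standard Solomonoff prior. The function names, description lengths, likelihoods, and toy grids are all hypothetical.

```python
from collections import defaultdict


def normalise(scores):
    """Scale non-negative scores so they sum to 1."""
    total = sum(scores)
    if total == 0:
        raise ValueError("all hypotheses have zero score")
    return [s / total for s in scores]


def simplicity_fit_weights(lengths_bits, likelihoods):
    """Assumed scoring form: 2^-length (simplicity) times likelihood (fit)."""
    raw = [2.0 ** -length * lik for length, lik in zip(lengths_bits, likelihoods)]
    return normalise(raw)


def per_cell_mixture(predicted_grids, weights):
    """Build a grid of {colour: probability} dicts from weighted hypothesis votes."""
    rows, cols = len(predicted_grids[0]), len(predicted_grids[0][0])
    mixture = [[defaultdict(float) for _ in range(cols)] for _ in range(rows)]
    for grid, weight in zip(predicted_grids, weights):
        for r in range(rows):
            for c in range(cols):
                # Each hypothesis votes for its predicted colour with its weight.
                mixture[r][c][grid[r][c]] += weight
    return [[dict(cell) for cell in row] for row in mixture]


# Toy example: three hypothetical hypotheses predicting a 2x2 output grid.
grids = [
    [[1, 0], [0, 1]],
    [[1, 0], [0, 2]],
    [[1, 1], [0, 2]],
]
lengths_bits = [3, 5, 4]        # hypothetical description lengths in bits
likelihoods = [0.5, 1.0, 0.25]  # hypothetical fit to the training examples

weights = simplicity_fit_weights(lengths_bits, likelihoods)
mixture = per_cell_mixture(grids, weights)
```

In this toy setup the weights come out as 4/7, 2/7 and 1/7. Cells on which all hypotheses agree receive probability 1 for that colour, while the bottom-right cell splits its mass between colours 1 and 2, which is the kind of per-cell uncertainty the abstract describes.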
