Online and Interactive Bayesian Inference Debugging
Nathanael Nussbaumer, Markus Böck, Jürgen Cito
TL;DR
The paper tackles the difficulty of debugging Bayesian inference in probabilistic programming by introducing InferLog Holmes, an online and interactive debugger tailored to MCMC workflows. Delivered as a VSCode extension, it combines model visualization, live inference diagnostics, and contextualized warnings, so problems can be analyzed while inference is still running. A controlled study with 18 participants shows that the tool helps participants resolve more issues and speeds up issue identification and iteration, especially on more complex tasks, while reducing the time spent waiting on inference. The work contributes six design requirements, an open-source implementation, and empirical evidence that online Bayesian inference debugging can accelerate development and improve practitioner success. It also lays groundwork for integration with further probabilistic programming languages (PPLs) and for future enhancements.
Abstract
Probabilistic programming is a rapidly developing programming paradigm that enables the formulation of Bayesian models as programs and the automation of posterior inference. It facilitates developing models and conducting Bayesian inference, which makes these techniques available to practitioners from many fields. Nevertheless, probabilistic programming is notoriously difficult, as identifying and repairing issues with inference requires a lot of time and deep knowledge. In this work, we introduce a novel approach to debugging Bayesian inference that significantly reduces the time and knowledge required. We discuss several requirements a Bayesian inference debugging framework has to fulfill, and propose a new tool that meets these key requirements directly within the development environment. We evaluate our approach in a study with 18 experienced participants and show that online and interactive debugging of Bayesian inference significantly reduces time and difficulty on inference debugging tasks.
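
To make the idea of live inference diagnostics concrete, the following is a minimal sketch of what "online" checking during MCMC could look like. It is not the tool's implementation: the sampler is a plain random-walk Metropolis, the diagnostic is split R-hat, and every name here (`split_rhat`, `metropolis_step`, `sample_with_live_diagnostics`, `check_every`, `rhat_threshold`) is hypothetical and chosen for illustration. The default threshold of 1.01 is a commonly used convention, not a value taken from the paper.

```python
import math
import random
from statistics import fmean, variance


def split_rhat(chains):
    """Split R-hat over equally long chains (each needs at least 4 draws)."""
    half = len(chains[0]) // 2
    # Split every chain in two so that drift within a chain also counts
    # as disagreement between "chains".
    parts = [c[i * half:(i + 1) * half] for c in chains for i in (0, 1)]
    within = fmean([variance(p) for p in parts])
    if within == 0.0:
        return math.inf  # chains did not move at all
    between = half * variance([fmean(p) for p in parts])
    pooled = (half - 1) / half * within + between / half
    return math.sqrt(pooled / within)


def metropolis_step(x, log_density, step_size, rng):
    """One random-walk Metropolis update for a scalar parameter."""
    proposal = x + rng.gauss(0.0, step_size)
    log_accept = log_density(proposal) - log_density(x)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return proposal
    return x


def sample_with_live_diagnostics(log_density, inits, n_iter, step_size,
                                 check_every=100, rhat_threshold=1.01, seed=0):
    """Run several chains and yield (iteration, rhat, warning_flag)
    every `check_every` iterations, while sampling is still in progress."""
    rng = random.Random(seed)
    states = list(inits)
    chains = [[] for _ in inits]
    for it in range(1, n_iter + 1):
        for k in range(len(chains)):
            states[k] = metropolis_step(states[k], log_density, step_size, rng)
            chains[k].append(states[k])
        if it % check_every == 0:
            rhat = split_rhat(chains)
            yield it, rhat, rhat > rhat_threshold


if __name__ == "__main__":
    def log_std_normal(x):
        return -0.5 * x * x

    # A deliberately tiny step size with far-apart starting points:
    # the warning should appear long before sampling finishes.
    for it, rhat, warn in sample_with_live_diagnostics(
            log_std_normal, [-10.0, 10.0], 1000, 0.01):
        status = "WARNING: chains disagree" if warn else "ok"
        print(f"iter {it:5d}  split R-hat = {rhat:8.3f}  {status}")
```

Because the diagnostics are yielded from inside the sampling loop, a caller can surface a warning, or stop the run, as soon as the chains visibly disagree rather than after inference has finished. This is the kind of reduction in waiting time that the online approach aims for.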
