An exploration of sequential Bayesian variable selection -- A comment on García-Donato et al. (2025). "Model uncertainty and missing data: An objective Bayesian perspective"
Sebastian Arnold, Alexander Ly
TL;DR
This comment investigates sequential Bayesian variable selection under missing data by extending the GCCQF framework (the method of García-Donato et al., 2025) with Sequential Model Confidence Sets (SMCS) to monitor evidence as data accumulate. It formalizes SMCS as a time-dependent collection of models whose coverage is guaranteed via a bound on model-wins, and it derives SMCS-based inclusion probabilities for covariates, bridging frequentist sequential guarantees and Bayesian inclusion concepts. In simulations from a linear-regression data-generating process, SMCS can stabilize posterior inclusion behavior and reduce fluctuations, especially when combined with GCCQF (the mixed approach), albeit with some increased risk of misclassifying inactive covariates late in the sequence. The work highlights the potential of safe sequential inference for Bayesian variable selection and outlines avenues for tuning, adaptivity, and theoretical connections to Bayesian decision-making.
Abstract
Our comment on García-Donato et al. (2025), "Model uncertainty and missing data: An objective Bayesian perspective", explores a further extension of the proposed methodology. Specifically, we consider the sequential setting in which (potentially missing) data accumulate over time, with the goal of monitoring statistical evidence continuously rather than assessing it only once data collection has terminated. We explore a new variable selection method based on sequential model confidence sets, as proposed by Arnold et al. (2024), and show that it can help stabilise the inference of García-Donato et al. (2025). To be published as "Invited discussion" in Bayesian Analysis.
