Some Bayesian Perspectives on Clinical Trials
Alexandra Sokolova, Vadim Sokolov, Nick Polson
TL;DR
The paper develops and demonstrates a cohesive Bayesian framework for clinical trials that unites principled prior elicitation, exact or efficient sequential design via backward induction, and decision-theoretic optimization. It introduces exact Beta-Binomial backward induction for binary endpoints, bridges to covariate-adjusted logistic models with Pólya-Gamma augmentation, and analyzes Thompson sampling, predictive stopping, and calibrated utilities. Through ECMO, CALGB 49907, and I-SPY 2 case studies, it shows how informative priors, early stopping, and adaptive enrichment can yield substantial sample-size savings, albeit with trade-offs in power and frequentist operating characteristics. The work provides practical guidance for regulatory submissions under the 2026 FDA draft guidance on Bayesian methodology and highlights contexts, especially rare diseases and pediatrics, where patient-sparing Bayesian designs are particularly advantageous. It emphasizes that decisions under uncertainty should maximize expected utility, with priors and sequential learning allowing earlier conclusions than fixed-sample designs at the cost of the power trade-off noted above.
Abstract
We examine three landmark clinical trials (ECMO, CALGB 49907, and I-SPY 2) through a unified Bayesian framework connecting prior specification, sequential adaptation, and decision-theoretic optimisation. For ECMO, the posterior probability of treatment superiority is robust across the range of priors examined. For CALGB, predictive probability monitoring stopped enrolment at 633 instead of 1800 patients. For I-SPY 2, adaptive enrichment graduated nine of 23 arms to Phase III. These case studies motivate a methodological contribution: exact backward induction for two-arm binary trials, where Beta-Binomial conjugacy yields closed-form transitions on the integer lattice of success counts with no quadrature. A Pólya-Gamma augmentation bridges this to covariate-adjusted logistic regression. Simulation reveals a fundamental tension: the optimal Bayesian design reduces expected sample sizes to 14–26 per arm (versus 42–100 for alternatives) but with substantially lower power. A calibrated variant embedding the declaration threshold in the terminal utility improves power while maintaining sample-size savings; varying the per-stage cost traces a power frontier for selecting the preferred operating point, with suitability highest in patient-sparing contexts such as rare diseases and paediatrics. The Pólya-Gamma Laplace approximation is validated against exact calculations (mean absolute error below 0.01). We discuss implications for the 2026 FDA draft guidance on Bayesian methodology.
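
To make the backward-induction idea concrete, the following is a minimal Python sketch rather than the implementation used in the paper. It assumes a simplified design chosen purely for illustration: one patient per arm at each stage, independent Beta(a0, b0) priors on the two response rates, a terminal utility equal to the larger posterior mean response rate, and a fixed per-stage cost. The function `solve` and all of its parameters are hypothetical names. The point it illustrates is that, under Beta-Binomial conjugacy, each transition probability is a posterior mean, so the value function can be computed on the lattice of success counts without any numerical integration.

```python
from functools import lru_cache


def solve(n_max=20, cost=0.005, a0=1.0, b0=1.0):
    """Backward induction for a two-arm binary trial (illustrative sketch).

    Assumptions (hypothetical, not taken from the paper): each stage enrols
    one patient per arm, both arms have independent Beta(a0, b0) priors,
    stopping yields the larger posterior mean response rate as utility, and
    every further stage costs `cost` utility units.
    """

    def post_mean(s, n):
        # Posterior mean of Beta(a0 + s, b0 + n - s); this is also the
        # predictive probability that the next patient on that arm responds.
        return (a0 + s) / (a0 + b0 + n)

    @lru_cache(maxsize=None)
    def stop_and_continue(n, s1, s2):
        # State: n patients seen per arm, with s1 and s2 successes.
        stop = max(post_mean(s1, n), post_mean(s2, n))
        if n == n_max:
            return stop, float("-inf")  # horizon reached: stopping is forced
        p1, p2 = post_mean(s1, n), post_mean(s2, n)
        cont = -cost
        # Closed-form expectation over the four joint outcomes of the next pair.
        for y1, w1 in ((1, p1), (0, 1.0 - p1)):
            for y2, w2 in ((1, p2), (0, 1.0 - p2)):
                cont += w1 * w2 * value(n + 1, s1 + y1, s2 + y2)
        return stop, cont

    def value(n, s1, s2):
        # Bellman equation: take the better of stopping and continuing.
        return max(stop_and_continue(n, s1, s2))

    def should_stop(n, s1, s2):
        stop, cont = stop_and_continue(n, s1, s2)
        return stop >= cont

    @lru_cache(maxsize=None)
    def expected_stages(n, s1, s2):
        # Expected number of further stages under the optimal policy,
        # averaged over the current predictive distribution.
        if should_stop(n, s1, s2):
            return 0.0
        p1, p2 = post_mean(s1, n), post_mean(s2, n)
        total = 1.0
        for y1, w1 in ((1, p1), (0, 1.0 - p1)):
            for y2, w2 in ((1, p2), (0, 1.0 - p2)):
                total += w1 * w2 * expected_stages(n + 1, s1 + y1, s2 + y2)
        return total

    return value, should_stop, expected_stages


if __name__ == "__main__":
    value, should_stop, expected_stages = solve()
    print("value at start:", value(0, 0, 0))
    print("expected patients per arm:", expected_stages(0, 0, 0))
```

Rerunning `solve` over a grid of `cost` values gives one expected sample size per cost, which is the kind of sweep the abstract describes for tracing a frontier; computing power along that sweep would additionally require a declaration rule, which this sketch does not include.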
