Some Bayesian Perspectives on Clinical Trials

Alexandra Sokolova, Vadim Sokolov, Nick Polson

TL;DR

The paper develops and demonstrates a cohesive Bayesian framework for clinical trials that unites principled prior elicitation, exact or efficient sequential design via backward induction, and decision-theoretic optimization. It introduces exact Beta-Binomial backward induction for binary endpoints, bridges to covariate-adjusted logistic models with Pólya-Gamma augmentation, and analyzes Thompson sampling, predictive stopping, and calibrated utilities. Through ECMO, CALGB 49907, and I-SPY 2 case studies, it shows how informative priors, early stopping, and adaptive enrichment can yield substantial sample-size savings, albeit with trade-offs in power and frequentist operating characteristics. The work provides practical guidance for regulatory submissions under the 2026 FDA draft guidance on Bayesian methodology and highlights contexts, especially rare diseases and pediatrics, where patient-sparing Bayesian designs are particularly advantageous. It emphasizes that decisions under uncertainty should maximize expected utility, with priors and sequential learning driving faster, more informative conclusions than fixed-sample designs.

Abstract

We examine three landmark clinical trials (ECMO, CALGB 49907, and I-SPY 2) through a unified Bayesian framework connecting prior specification, sequential adaptation, and decision-theoretic optimisation. For ECMO, the posterior probability of treatment superiority is robust across the range of priors examined. For CALGB, predictive probability monitoring stopped enrolment at 633 instead of 1800 patients. For I-SPY 2, adaptive enrichment graduated nine of 23 arms to Phase III. These case studies motivate a methodological contribution: exact backward induction for two-arm binary trials, where Beta-Binomial conjugacy yields closed-form transitions on the integer lattice of success counts with no quadrature. A Pólya-Gamma augmentation bridges this to covariate-adjusted logistic regression. Simulation reveals a fundamental tension: the optimal Bayesian design reduces expected sample sizes to 14 to 26 per arm (versus 42 to 100 for alternatives) but with substantially lower power. A calibrated variant embedding the declaration threshold in the terminal utility improves power while maintaining sample-size savings; varying the per-stage cost traces a power frontier for selecting the preferred operating point, with suitability highest in patient-sparing contexts such as rare diseases and paediatrics. The Pólya-Gamma Laplace approximation is validated against exact calculations (mean absolute error below 0.01). We discuss implications for the 2026 FDA draft guidance on Bayesian methodology.
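The prior-robustness check described for ECMO can be illustrated with a minimal sketch. It estimates the posterior probability that the treatment success rate exceeds the control rate under several independent Beta priors. The function name `prob_superiority`, the three prior settings, and the success counts are all hypothetical placeholders chosen for illustration; they are not the ECMO data or the paper's priors.

```python
import random

def prob_superiority(s1, n1, s0, n0, a=1.0, b=1.0, draws=20000, seed=0):
    """Monte Carlo estimate of Pr(theta_1 > theta_0 | data).

    Both arms share an independent Beta(a, b) prior (an assumption of this
    sketch); s_i successes out of n_i patients give Beta posteriors by conjugacy.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        theta1 = rng.betavariate(a + s1, b + n1 - s1)  # treatment posterior draw
        theta0 = rng.betavariate(a + s0, b + n0 - s0)  # control posterior draw
        wins += theta1 > theta0
    return wins / draws

# Hypothetical priors: Jeffreys, uniform, and an illustrative informative prior.
PRIORS = {"Jeffreys": (0.5, 0.5), "uniform": (1.0, 1.0), "informative": (3.0, 7.0)}

def sensitivity(s1, n1, s0, n0):
    """Posterior probability of superiority under each prior in PRIORS."""
    return {name: prob_superiority(s1, n1, s0, n0, a, b)
            for name, (a, b) in PRIORS.items()}

if __name__ == "__main__":
    # Hypothetical counts: 9/10 successes on treatment, 2/10 on control.
    for name, p in sensitivity(9, 10, 2, 10).items():
        print(f"{name:12s} Pr(theta_1 > theta_0 | data) = {p:.3f}")
```

If the estimates stay close to one another across priors, the conclusion is robust in the sense used above; a large spread would signal that the prior, not the data, is driving the result.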

Paper Structure (39 sections, 2 theorems, 29 equations, 8 figures, 3 tables)


Key Result

Proposition 1

In a two-arm trial with balanced allocation, Beta$(\alpha_i, \beta_i)$ priors, and per-stage cost $c > 0$, the optimal stopping rule depends on the data only through the pair of success counts $(s_1, s_0)$ at each stage $k$. The backward induction \ref{eq:binary_bellman} has $(k+1)^2$ states at stage $k$, for ...
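The sufficient-statistic reduction can be sketched as a small dynamic program over the success counts. The excerpt does not show the paper's Bellman equation or terminal utility, so this sketch makes its own assumptions: one patient per arm per stage, a finite horizon `K`, and a terminal payoff equal to the larger posterior mean (choose the apparently better arm). The names `solve_design` and `value` are hypothetical.

```python
from functools import lru_cache

def solve_design(K, a1=1.0, b1=1.0, a0=1.0, b0=1.0, cost=0.005):
    """Backward induction on the lattice of success counts (s1, s0).

    Assumed setup (not taken from the paper): each stage enrols one patient
    per arm, so after k stages each arm has k outcomes and s1, s0 lie in
    {0, ..., k}, giving (k + 1)**2 states at stage k.
    """
    def post_mean(a, b, s, k):
        # Posterior mean of Beta(a + s, b + k - s); also the predictive
        # probability that the next patient on this arm is a success.
        return (a + s) / (a + b + k)

    @lru_cache(maxsize=None)
    def value(k, s1, s0):
        m1 = post_mean(a1, b1, s1, k)
        m0 = post_mean(a0, b0, s0, k)
        stop = max(m1, m0)            # assumed terminal utility
        if k == K:
            return stop, "stop"
        cont = -cost                  # pay the per-stage cost, then observe
        for y1, p1 in ((1, m1), (0, 1.0 - m1)):
            for y0, p0 in ((1, m0), (0, 1.0 - m0)):
                cont += p1 * p0 * value(k + 1, s1 + y1, s0 + y0)[0]
        return (cont, "continue") if cont > stop else (stop, "stop")

    return value

if __name__ == "__main__":
    value = solve_design(K=20, cost=0.005)
    print(value(0, 0, 0))   # value and action before any data
    print(value(5, 5, 1))   # after 5 stages: 5 vs 1 successes
```

Because the Beta-Binomial predictive probabilities are available in closed form, every transition is an exact four-term sum and no numerical integration is needed, which matches the "no quadrature" property claimed above.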

Figures (8)

  • Figure 1: Implied prior density on the treatment effect $\delta = \theta_1 - \theta_0$ under three choices of independent priors for the success probabilities $\theta_0$ and $\theta_1$. The shaded band marks the range of typical clinical effects ($|\delta| \leq 0.15$). Jeffreys priors (red) concentrate mass near $\pm 1$; uniform priors (blue) spread mass broadly; an informative prior (grey) concentrates mass where treatment effects are known to lie.
  • Figure 2: Conditional means prior for a dose-response model. Left: the clinician specifies Beta priors on the response probability at a low dose (10 mg, blue) and a high dose (50 mg, red). Right: the induced joint prior on the logistic regression coefficients $(\beta_0, \beta_1)$ via \ref{eq:cmp}. The prior on coefficients is proper and concentrated, even though the clinician never reasoned about regression parameters directly.
  • Figure 3: Joint posterior for $\mu = (\mu_1, \mu_2)'$ in the hierarchical logistic-normal model \ref{eq:multicentre} fitted to the eight-centre topical cream data of Skene and Wakefield (1990) via Pólya-Gamma Gibbs sampling. Each grey dot is a posterior draw; the diagonal line marks $\mu_1 = \mu_2$ (equal treatment and control). The mass below the diagonal ($\Pr(\mu_1 > \mu_2 \mid \text{data}) = 0.985$) indicates strong evidence of treatment efficacy despite small per-centre samples.
  • Figure 4: Thompson sampling in a two-arm trial ($p_0 = 0.30$, $p_1 = 0.45$). Each curve shows the running proportion of patients assigned to the treatment arm across eight independent simulations. Under equal randomisation (dashed line), half of all patients receive the inferior control. Thompson sampling progressively shifts allocation toward the superior arm as evidence accumulates.
  • Figure 5: Optimal stopping regions from backward induction on the Normal model \ref{eq:normal_model} with $\sigma^2 = 4$, $\sigma_0^2 = 1$, and sampling cost $c = 0.005$ per patient. The upper region (green) indicates stopping in favour of treatment; the lower region (red) indicates stopping in favour of control; the middle region indicates continuing. Coloured curves show eight simulated trial paths with different true $\theta$ values, each terminating when it crosses a boundary.
  • ...and 3 more figures
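The allocation behaviour described in the Figure 4 caption can be reproduced in outline with a short simulation. This is a sketch under stated assumptions, not the authors' code: it uses uniform Beta(1, 1) priors on both arms and assigns each patient to the arm with the larger posterior draw. The success rates $p_0 = 0.30$ and $p_1 = 0.45$ come from the caption; the function name `thompson_trial`, the trial length, and the seeds are hypothetical.

```python
import random

def thompson_trial(n_patients, p0=0.30, p1=0.45, seed=0):
    """Simulate one two-arm trial with Thompson sampling.

    Returns the running proportion of patients assigned to the treatment
    arm (arm 1) after each patient. Beta(1, 1) priors are an assumption.
    """
    rng = random.Random(seed)
    succ = [0, 0]   # successes on control (index 0) and treatment (index 1)
    fail = [0, 0]   # failures on each arm
    treated = 0
    running = []
    for t in range(1, n_patients + 1):
        # One draw from each arm's Beta posterior; assign to the larger draw.
        draws = [rng.betavariate(1 + succ[a], 1 + fail[a]) for a in (0, 1)]
        arm = 1 if draws[1] > draws[0] else 0
        # Simulate the binary outcome from the true success rate of that arm.
        success = rng.random() < (p1 if arm == 1 else p0)
        if success:
            succ[arm] += 1
        else:
            fail[arm] += 1
        treated += arm
        running.append(treated / t)
    return running

if __name__ == "__main__":
    # Eight independent simulations, mirroring the eight curves in the caption.
    for seed in range(8):
        path = thompson_trial(500, seed=seed)
        print(f"seed {seed}: final share on treatment = {path[-1]:.2f}")
```

Under equal randomisation the final share would hover around 0.5 by construction; here it varies from run to run because early outcomes influence later assignments.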

Theorems & Definitions (2)

  • Proposition 1: Sufficient-statistic reduction for binary trials
  • Proposition 2: Pólya-Gamma bridge