Synthetic Priors

Nick Polson, Vadim Sokolov

Abstract

Bayesian inference in generalized linear models (GLMs) requires a prior on the coefficient vector $\beta$. Practitioners naturally reason about response probabilities at specific covariate values, not about abstract log-odds parameters. We develop synthetic priors: informative Bayesian priors for GLMs grounded in Good's device of imaginary observations -- the principle that every conjugate prior is equivalent to a likelihood on pseudo-data from the same exponential family. The conditional means prior of Bedrick, Christensen, and Johnson (1996), referred to below as the BCJ prior, elicits independent Beta priors on the conditional mean response at $p$ expert-chosen design points; the induced prior on $\beta$ is a product of binomial likelihoods at synthetic data points. Combined with Pólya-Gamma data augmentation \citep{polson2013}, the posterior admits an exact conjugate Gibbs sampler -- no tuning, no Metropolis step -- obtained by treating the augmented dataset as a standard logistic regression. We show that ridge regression and catalytic priors \citep{huang2020} are instances of Good's device, and identify prediction-powered inference \citep{angelopoulos2023ppi} as a structural analogue in the frequentist setting -- all three mediate a bias-variance tradeoff through a single informativeness parameter. We illustrate the approach on two benchmark problems: the Challenger O-ring data \citep{dalal1989}, where the BCJ prior provides a more moderate posterior predictive at the 31°F launch temperature; and a Phase II atopic dermatitis dose-finding trial ($n = 300$), where the synthetic prior narrows 95% credible intervals by 3-6% and raises decision probabilities by up to 2 percentage points relative to a flat prior.
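
As a rough illustration of the pseudo-data idea in the abstract, the sketch below turns two Beta priors into fractional binomial rows and appends them to a dataset before fitting a logistic regression. For the logistic link, the Jacobian of the map from $\beta$ to the conditional means contributes a factor $m_j(1 - m_j)$ at each design point, so a Beta$(a_j, b_j)$ prior acts like $a_j$ successes in $a_j + b_j$ trials. All numbers, design points, and function names here are hypothetical, and the code finds the posterior mode by Newton's method rather than running the Pólya-Gamma Gibbs sampler described in the paper.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, n_iter=50):
    """Newton's method for a logistic regression with intercept and one slope.

    rows: list of (x, successes, trials). Successes may be fractional,
    which is what lets prior pseudo-observations enter exactly like data.
    Returns (intercept, slope) maximizing the binomial log-likelihood.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, s, n in rows:
            p = sigmoid(b0 + b1 * x)
            r = s - n * p            # score contribution of this row
            w = n * p * (1.0 - p)    # information weight of this row
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H d = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical elicitation: Beta(a, b) priors on the success probability
# at two design points (standardized covariate values -1 and +1).
prior_beta = [(-1.0, 1.0, 4.0), (1.0, 3.5, 1.5)]  # (x_tilde, a, b)

# Each Beta(a, b) becomes a pseudo-observation: a successes in a + b trials.
prior_rows = [(x, a, a + b) for x, a, b in prior_beta]

# Hypothetical observed data: (x, successes, trials).
data_rows = [(-1.0, 1, 10), (0.0, 4, 10), (1.0, 8, 10)]

prior_mode = fit_logistic(prior_rows)                   # mode of the induced prior on beta
posterior_mode = fit_logistic(data_rows + prior_rows)   # mode after adding the data
```

With only the two prior rows the fit is saturated, so the fitted probabilities at the design points equal the elicited prior means $a_j / (a_j + b_j)$, here 0.2 and 0.7. Scaling $a_j$ and $b_j$ by a common factor leaves those means unchanged but gives the pseudo-rows more weight relative to the data, which is one way to read the single informativeness parameter mentioned above.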

Paper Structure (14 sections, 4 theorems, 13 equations, 1 figure, 5 tables)

This paper contains 14 sections, 4 theorems, 13 equations, 1 figure, 5 tables.

Key Result

Proposition 1

Let $L(\theta \mid y) = h(y)\exp\{\langle\theta, T(y)\rangle - A(\theta)\}$ be an exponential family likelihood with natural parameter $\theta$ and sufficient statistic $T(y)$. The conjugate prior is proportional to the likelihood of $\nu_0$ observations with sufficient statistic $t_0$. The posterior after observing $n$ data points with total sufficient statistic $T = \sum_{i=1}^n T(y_i)$ is the
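
The statement above is cut off in this listing, so the following is only a sketch of the prior-data equivalence in its simplest case, the Beta-Binomial model. It assumes the usual parameterization in which a Beta$(a, b)$ prior corresponds to $\nu_0 = a + b$ imaginary observations with sufficient statistic $t_0 = a$ imaginary successes; the numbers and the helper `beta_update` are hypothetical.

```python
def beta_update(a, b, successes, trials):
    """Conjugate update: Beta(a, b) prior plus binomial data -> Beta posterior."""
    return a + successes, b + trials - successes


# Hypothetical prior: 2 imaginary successes and 8 imaginary failures.
a0, b0 = 2.0, 8.0
nu0, t0 = a0 + b0, a0        # imaginary sample size and sufficient statistic

# Hypothetical data: 7 successes in 20 trials.
s, n = 7, 20

# Route 1: ordinary conjugate update of the Beta(a0, b0) prior.
direct = beta_update(a0, b0, s, n)

# Route 2: Good's device. Start from the improper Beta(0, 0) base measure
# and feed it the imaginary and real observations as one pooled dataset.
pooled = beta_update(0.0, 0.0, t0 + s, nu0 + n)

# The posterior mean is a weighted average of the prior mean and the sample mean,
# with weights proportional to the imaginary and real sample sizes.
post_mean = direct[0] / (direct[0] + direct[1])
weight = nu0 / (nu0 + n)
blend = weight * (t0 / nu0) + (1.0 - weight) * (s / n)
```

Both routes give the same Beta(9, 21) posterior, and the posterior mean 0.3 sits between the prior mean 0.2 and the sample proportion 0.35.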

Figures (1)

  • Figure 1: Atopic dermatitis dose-finding simulation, averaged over 30 trials. Left: Posterior mean dose-response curve with average 95% credible bands for BCJ (blue) and flat (orange dashed) priors; dotted black curve is the true Emax model. Right: Posterior probability of clinically meaningful benefit over placebo ($>5$ percentage points) as a function of dose; horizontal reference lines at 80% and 95%.

Theorems & Definitions (8)

  • Proposition 1: Prior–data equivalence
  • Example 1: Beta–Binomial
  • Proposition 2: Data augmentation representation \citep{bedrick1996}
  • Remark 1: Nonlinear parameters
  • Remark 2: Ridge regression as Good's device
  • Proposition 3: Pólya-Gamma Identity \citep{polson2013}
  • Proposition 4: Ridge as augmented OLS
  • proof
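
The ridge item in the list above can be checked numerically. The sketch below assumes the standard construction, which may differ in detail from the paper's Proposition 4: appending $p$ imaginary rows $\sqrt{\lambda}\, e_j$ with response 0 to the design matrix makes ordinary least squares on the augmented data coincide with the ridge estimate $(X^\top X + \lambda I)^{-1} X^\top y$. The data, the value of `lam`, and the helper names are hypothetical, and a small hand-written solver is used only to keep the example dependency-free.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def normal_equations(X, y, lam=0.0):
    """Return the solution of (X'X + lam * I) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


# Hypothetical data: intercept column plus one covariate.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.3]]
y = [1.1, -0.7, 2.9, 0.4]
lam = 2.0
p = len(X[0])

# Ridge estimate computed directly (the penalty here also covers the intercept).
ridge = normal_equations(X, y, lam)

# Good's device: p imaginary observations sqrt(lam) * e_j with response 0,
# followed by plain OLS (lam = 0) on the augmented dataset.
root = lam ** 0.5
X_aug = X + [[root if i == j else 0.0 for j in range(p)] for i in range(p)]
y_aug = y + [0.0] * p
augmented = normal_equations(X_aug, y_aug)
```

The two coefficient vectors agree up to floating-point error, because the imaginary rows add exactly $\lambda I$ to $X^\top X$ and nothing to $X^\top y$. In this reading $\lambda$ plays the role of the informativeness parameter: larger values give the imaginary observations more weight.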