Synthetic Priors

Nick Polson; Vadim Sokolov

Abstract

Bayesian inference in generalized linear models (GLMs) requires a prior on the coefficient vector $\beta$. Practitioners, however, reason more naturally about response probabilities at specific covariate values than about abstract log-odds parameters. We develop synthetic priors: informative Bayesian priors for GLMs grounded in Good's device of imaginary observations -- the principle that every conjugate prior is equivalent to a likelihood on pseudo-data from the same exponential family. The conditional means prior of Bedrick, Christensen, and Johnson (1996), hereafter the BCJ prior, elicits independent Beta priors on the conditional mean response at $p$ expert-chosen design points; the induced prior on $\beta$ is a product of binomial likelihoods at synthetic data points. Combined with Pólya-Gamma data augmentation \citep{polson2013}, the posterior admits an exact conjugate Gibbs sampler -- no tuning, no Metropolis step -- obtained by treating the augmented dataset as a standard logistic regression. We show that ridge regression and catalytic priors \citep{huang2020} are instances of Good's device, and identify prediction-powered inference \citep{angelopoulos2023ppi} as a structural analogue in the frequentist setting -- all three mediate a bias-variance tradeoff through a single informativeness parameter. We illustrate the approach on two benchmark problems: the Challenger O-ring data \citep{dalal1989}, where the BCJ prior provides a more moderate posterior predictive at the 31°F launch temperature; and a Phase~II atopic dermatitis dose-finding trial ($n = 300$), where the synthetic prior narrows 95\% credible intervals by 3--6\% and raises decision probabilities by up to 2 percentage points relative to a flat prior.
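The claim that the induced prior on the coefficients is a product of binomial likelihoods can be checked numerically. The sketch below is a minimal illustration, not the authors' code: it assumes a logit link, as many design points as coefficients (so the Jacobian's determinant factor is a constant), and hypothetical design points and Beta parameters chosen only for the example. Under those assumptions, the Beta density on each conditional mean times the Jacobian of the logit map equals a binomial kernel with `a_i` pseudo-successes out of `a_i + b_i` pseudo-trials.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_predictor(beta, x):
    """Inner product of the coefficient vector and one covariate row."""
    return sum(bj * xj for bj, xj in zip(beta, x))

def binomial_loglik(beta, rows, successes, trials):
    """Logistic-regression log-likelihood, dropping the binomial coefficient."""
    total = 0.0
    for x, y, n in zip(rows, successes, trials):
        p = sigmoid(linear_predictor(beta, x))
        total += y * math.log(p) + (n - y) * math.log(1.0 - p)
    return total

def bcj_log_prior(beta, design_points, a, b):
    """Induced log-prior on beta, up to an additive constant.

    Each design point contributes a Beta(a_i, b_i) kernel on p_i plus the
    log-Jacobian of the logit map, log(p_i) + log(1 - p_i).
    """
    total = 0.0
    for x, ai, bi in zip(design_points, a, b):
        p = sigmoid(linear_predictor(beta, x))
        log_beta_kernel = (ai - 1.0) * math.log(p) + (bi - 1.0) * math.log(1.0 - p)
        log_jacobian = math.log(p) + math.log(1.0 - p)
        total += log_beta_kernel + log_jacobian
    return total

# Hypothetical elicitation: intercept plus one covariate, two design points.
design = [(1.0, -1.0), (1.0, 1.0)]
a = [2.0, 1.0]   # assumed Beta first parameters (pseudo-successes)
b = [1.0, 3.0]   # assumed Beta second parameters (pseudo-failures)
beta = (0.3, -0.7)

prior_val = bcj_log_prior(beta, design, a, b)
pseudo_val = binomial_loglik(beta, design, a, [ai + bi for ai, bi in zip(a, b)])
```

Because the two quantities agree for any `beta`, the log-posterior can be evaluated by appending the pseudo-rows to the observed data and calling `binomial_loglik` once on the stacked dataset.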

Paper Structure (14 sections, 4 theorems, 13 equations, 1 figure, 5 tables)

Introduction
Synthetic Priors via Conditional Predictive Means
Good's Device and the BCJ Prior
Pólya-Gamma Data Augmentation
Connections with Previous Work
Ridge Regression: The Gaussian Linear Case
Catalytic Priors and Prediction-Powered Inference
The Control Variate Perspective
Application: Atopic Dermatitis Phase II Dose-Finding
Clinical Setting
BCJ Prior Specification
Results
O-Ring Failure: Reproducing the BCJ Benchmark
Discussion

Key Result

Proposition 1

Let $L(\theta \mid y) = h(y)\exp\{\langle\theta, T(y)\rangle - A(\theta)\}$ be an exponential family likelihood with natural parameter $\theta$ and sufficient statistic $T(y)$. The conjugate prior is proportional to the likelihood of $\nu_0$ observations with sufficient statistic $t_0$. The posterior after observing $n$ data points with total sufficient statistic $T = \sum_{i=1}^n T(y_i)$ is the
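The statement of Proposition 1 breaks off mid-sentence in this excerpt. For orientation only, the textbook conjugate update for this family can be written as below. It assumes the prior is parametrized by the pair $(\nu_0, t_0)$ named in the statement, with $t_0$ read as the total sufficient statistic of the $\nu_0$ pseudo-observations; it is the standard exponential-family result, not a quotation of the paper's own equation.

```latex
\begin{align*}
\pi(\theta \mid \nu_0, t_0)
  &\propto \exp\{\langle\theta, t_0\rangle - \nu_0 A(\theta)\}, \\
\pi(\theta \mid y_1, \dots, y_n)
  &\propto \pi(\theta \mid \nu_0, t_0) \prod_{i=1}^n L(\theta \mid y_i) \\
  &\propto \exp\{\langle\theta, t_0 + T\rangle - (\nu_0 + n) A(\theta)\}.
\end{align*}
```

Under this reading the posterior has the same form as the prior with $(\nu_0, t_0)$ replaced by $(\nu_0 + n, t_0 + T)$, so the pseudo-observations and the real observations enter symmetrically. For a Bernoulli likelihood with $\theta = \operatorname{logit}(p)$ and $A(\theta) = \log(1 + e^{\theta})$, taking $\nu_0 = a + b$ and $t_0 = a$ gives a density in $\theta$ proportional to $p^{a}(1-p)^{b}$, which corresponds to a Beta$(a, b)$ density on the $p$ scale.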

Figures (1)

Figure 1: Atopic dermatitis dose-finding simulation, averaged over 30 trials. Left: Posterior mean dose-response curve with average 95% credible bands for BCJ (blue) and flat (orange dashed) priors; dotted black curve is the true Emax model. Right: Posterior probability of clinically meaningful benefit over placebo ($>5$ percentage points) as a function of dose; horizontal reference lines at 80% and 95%.

Theorems & Definitions (8)

Proposition 1: Prior--data equivalence
Example 1: Beta--Binomial
Proposition 2: Data augmentation representation \citep{bedrick1996}
Remark 1: Nonlinear parameters
Remark 2: Ridge regression as Good's device
Proposition 3: Pólya-Gamma identity \citep{polson2013}
Proposition 4: Ridge as augmented OLS
proof
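The titles of Remark 2 and Proposition 4 refer to the classical identity that a ridge estimate equals ordinary least squares on a dataset augmented with pseudo-observations. The sketch below illustrates that identity under stated assumptions: two coefficients, a penalty applied to both of them (including the intercept), and hypothetical data values chosen only for the example. It is not taken from the paper, whose exact formulation is not shown in this excerpt.

```python
import math

def solve_2x2(s11, s12, s22, t1, t2):
    """Solve the symmetric 2x2 system [[s11, s12], [s12, s22]] b = [t1, t2]."""
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

def cross_products(X, y):
    """Entries of X'X and X'y for a two-column design matrix."""
    s11 = sum(r[0] * r[0] for r in X)
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X)
    t1 = sum(r[0] * yi for r, yi in zip(X, y))
    t2 = sum(r[1] * yi for r, yi in zip(X, y))
    return s11, s12, s22, t1, t2

def ols_2d(X, y):
    """Ordinary least squares via the normal equations."""
    return solve_2x2(*cross_products(X, y))

def ridge_2d(X, y, lam):
    """Ridge estimate: solve (X'X + lam * I) b = X'y."""
    s11, s12, s22, t1, t2 = cross_products(X, y)
    return solve_2x2(s11 + lam, s12, s22 + lam, t1, t2)

# Hypothetical data: intercept column plus one covariate.
X = [(1.0, 0.5), (1.0, 1.5), (1.0, 2.5), (1.0, 3.5)]
y = [1.1, 1.9, 3.2, 3.8]
lam = 2.0  # assumed penalty strength

# Pseudo-observations: one row sqrt(lam) * e_j per coefficient, response 0.
root = math.sqrt(lam)
X_aug = X + [(root, 0.0), (0.0, root)]
y_aug = y + [0.0, 0.0]

ridge_fit = ridge_2d(X, y, lam)
augmented_fit = ols_2d(X_aug, y_aug)
```

The augmented rows add `lam` to each diagonal entry of `X'X` and leave `X'y` unchanged, which is why the two fits coincide; `lam` plays the role of the single informativeness parameter mentioned in the abstract.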
