Optimization-centric cutting feedback for semiparametric models

Linda S. L. Tan, David J. Nott, David T. Frazier

Abstract

Complex statistical models are often built by combining multiple submodels, called modules. Here we consider modular inference where the modules contain both parametric and nonparametric components. In such cases, standard Bayesian inference can be highly sensitive to misspecification in any module, and influential prior specifications for the nonparametric components can compromise inference for the parametric components, and vice versa. We propose a novel "optimization-centric" approach to cutting feedback for semiparametric modular inference, which can address misspecification and prior-data conflicts. The proposed cut posteriors are defined via a variational optimization problem like other generalized posteriors, but regularization is based on Rényi divergence, instead of Kullback-Leibler divergence (KLD). We show empirically that defining the cut posterior using Rényi divergence delivers more robust inference than KLD, and Rényi divergence reduces the tendency to underestimate uncertainty when the variational approximations impose strong parametric or independence assumptions. Novel posterior concentration results that accommodate the Rényi divergence and allow for semiparametric components are derived, extending existing results for cut posteriors that only apply to KLD and parametric models. These new methods are demonstrated in a benchmark example and two real examples: Gaussian process adjustments for confounding in causal inference and misspecified copula models with nonparametric marginals.
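The contrast between Rényi divergence and KLD in the abstract can be illustrated with a small numerical sketch. It assumes the standard definition $D_\alpha(p\,\|\,q) = \frac{1}{\alpha-1}\log\int p^\alpha q^{1-\alpha}$, whose limit as $\alpha \to 1$ is the KLD; the paper's exact parametrization and argument order are not shown in this excerpt and may differ. The function names `kl_gauss` and `renyi_gauss` and the test values are illustrative only and do not come from the paper. The sketch uses the known closed forms for two univariate Gaussians.

```python
import math


def kl_gauss(m1, s1, m2, s2):
    """KL(N(m1, s1^2) || N(m2, s2^2)) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2) ** 2) / (2 * s2**2) - 0.5


def renyi_gauss(alpha, m1, s1, m2, s2):
    """Renyi divergence of order alpha between N(m1, s1^2) and N(m2, s2^2).

    Assumes the standard definition (1/(alpha-1)) * log int p^alpha q^(1-alpha),
    with alpha > 0 and alpha != 1.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    # Variance of the alpha-weighted mixture; must be positive for a finite value.
    var_mix = alpha * s2**2 + (1 - alpha) * s1**2
    if var_mix <= 0:
        raise ValueError("divergence is infinite for this alpha")
    return (
        math.log(s2 / s1)
        + math.log(s2**2 / var_mix) / (2 * (alpha - 1))
        + alpha * (m1 - m2) ** 2 / (2 * var_mix)
    )


if __name__ == "__main__":
    # Hypothetical test pair: p = N(0, 1), q = N(1, 4).
    p, q = (0.0, 1.0), (1.0, 2.0)
    for alpha in (0.25, 0.5, 0.9, 0.999):
        print(f"alpha={alpha}: Renyi={renyi_gauss(alpha, *p, *q):.5f}")
    print(f"KL: {kl_gauss(*p, *q):.5f}")
```

Under this definition the divergence is nondecreasing in $\alpha$, so orders below 1 penalize departure from the reference distribution less heavily than KLD does, and an order close to 1 (such as the $\alpha=0.999$ used in Figure 1) closely approximates the KLD-based case.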

Paper Structure

This paper contains 30 sections, 6 theorems, 72 equations, 6 figures, 3 tables.

Key Result

Lemma 1

Consider a joint generalized posterior density for $\theta$ optimized over the class ${\mathcal{F}}_{\theta}=\{q(\theta) \mid q(\varphi) \in {\mathcal{F}}_{\varphi}, q(\eta\mid \varphi) \in {\mathcal{F}}_{\eta\mid \varphi} \}$, based on the divergence where $\mathcal{D}_\varphi$ and $\mathcal{D}_{\eta\mid \varphi}$ are the divergences in cut_varphi. Let $\lambda =1$ and consider the loss Then th

Figures (6)

  • Figure 1: Contour plots of joint posteriors for $\theta=(\varphi,\eta)$, with priors $\varphi\sim \text{N}(10,1)$ and $\eta\sim \text{N}(0,1)$. True value of $\theta$ is marked in red. Both the true and conventional cut posterior (approximated with $\alpha=0.999$) provide poor inference.
  • Figure 2: Simulation study. RMSE of posterior predictive mean from true value, and coverage probability of true value based on 95% credible intervals for $\omega(x)$ and $\tau(x)$.
  • Figure 3: STAR. Learning rates (left axis for $\lambda_1^\alpha$ and right axis for $\lambda_2^\alpha$), RMSE of posterior predictive mean of $\tau(x)$ from its unbiased estimate, and coverage probability of the unbiased estimate of $\tau(x)$ based on 95% credible intervals.
  • Figure 4: Discretization of marginal observations and the true and fitted marginal densities.
  • Figure 5: Discretization of observations and fitted marginal densities. 95% credible intervals are shaded in grey.
  • ...and 1 more figure

Theorems & Definitions (14)

  • Lemma 1
  • Remark 1
  • Remark 2
  • Example 1: Nonparametric quasi-likelihoods
  • Example 2: Copula modeling
  • Remark 3
  • Lemma 2
  • Lemma 3
  • Corollary 1
  • Remark 4
  • ...and 4 more