Local asymptotics of selection models with applications in Bayesian selective inference

Daniel G. Rasines, G. Alastair Young

TL;DR

The paper develops a local asymptotic theory for selection models arising from conditional and information-splitting selective inference. Under mild regularity conditions, a sequence of selection models can be approximated by a sequence of Gaussian selection models, a property termed local asymptotic selective normality (LASN) that extends Le Cam's Local Asymptotic Normality to a class of non-regular models. The LASN expansion has practical consequences for Bayesian selective inference: a Bernstein–von Mises type result links selective posteriors to Gaussian selection models with a selection-dependent normalizing factor, and Bayesian posteriors turn out to be markedly miscalibrated under standard priors. The familiar asymptotic equivalence between Bayesian and frequentist approaches therefore breaks down under selection unless priors are carefully tuned to the sample size and the selection mechanism. Concrete examples (deterministic selection, data carving, randomization, and inference on winners) illustrate the conditions under which the theory holds, giving a unified theoretical foundation for analyzing and calibrating inference after selection in non-Gaussian settings.

Abstract

Contemporary focus on selective inference provokes interest in the asymptotic properties of selection models, as the working inferential models in the conditional approach to inference after selection. In this paper, we derive an asymptotic expansion of the local likelihood ratios of selection models. We show that under mild regularity conditions, they behave asymptotically like a sequence of Gaussian selection models. This generalizes the Local Asymptotic Normality framework of Le Cam (1960) to a class of non-regular models, and indicates a notion of local asymptotic selective normality as the appropriate simplifying theoretical framework for analysis of selective inference. Furthermore, we establish practical consequences for Bayesian selective inference. Specifically, we derive the asymptotic shape of Bayesian posterior distributions constructed from selection models, and show that they will typically be significantly miscalibrated in a frequentist sense, demonstrating that the familiar asymptotic equivalence between Bayesian and frequentist approaches does not hold under selection.
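To make the notion of a Gaussian selection model concrete, the following is a minimal sketch in the simplest one-dimensional case. It assumes a single observation $Z \sim N(h, 1)$ that is reported only when $Z > t$ (a hard-threshold selection rule chosen here for illustration; the paper allows more general selection functions). The function names `selective_density` and `selective_log_lr` and the numerical values are hypothetical and are not taken from the paper.

```python
import math


def norm_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_sf(x: float) -> float:
    """Standard normal survival function P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def selective_density(z: float, h: float, t: float) -> float:
    """Density of Z ~ N(h, 1) conditional on the selection event Z > t.

    Assumption: hard-threshold selection, used only as an illustration.
    """
    if z <= t:
        return 0.0
    return norm_pdf(z - h) / norm_sf(t - h)


def selective_log_lr(z: float, h: float, t: float) -> float:
    """Log likelihood ratio of local parameter h against h = 0, given Z > t.

    The first term is the usual Gaussian-shift log likelihood ratio;
    the second is the correction from the selection probabilities.
    """
    plain = h * z - 0.5 * h * h
    correction = math.log(norm_sf(t)) - math.log(norm_sf(t - h))
    return plain + correction


if __name__ == "__main__":
    z_obs, threshold = 1.2, 0.5  # illustrative values only
    for h in (-1.0, 0.0, 1.0):
        print(h, selective_log_lr(z_obs, h, threshold))
```

When the threshold `t` is very negative, selection is almost certain, the correction term vanishes, and the expression reduces to the ordinary Gaussian-shift log likelihood ratio $hz - h^2/2$. The selection-dependent term is what separates the selective limit from the regular one.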

Paper Structure (15 sections, 5 theorems, 64 equations, 5 figures, 1 table)

Key Result

Theorem 1

Suppose that $\{F_\theta\colon \theta\in \Theta\}$ is DQM, let $Z_n \equiv Z_n(\theta) = n^{-1/2} I_\theta^{-1} \sum_{i = 1}^n \triangledown l_\theta(Y_i)$, and assume that the sequence of selection functions is uniformly bounded: For a fixed $\theta\in \Theta$, assume that $M(x) = E_\theta[\exp\{ x^T \triangledown l_\theta(Y_1)\}] < \infty$ in a neighborhood of the origin, that there exists a s

Figures (5)

  • Figure 1: Blue: $r_n(h; \theta)$; black: $r_n^*(h; \theta)$. Top left: deterministic selection with $t_n = 0.5$; top right: data carving with $t_n = 0.5$; bottom left: randomization with $W\sim N(0, 1)$ and $t_n = 0.5\sqrt{n}$; bottom right: conditioning on $\sqrt{n}\bar{Y}_n + W = u_n$, with $W\sim N(0, 1)$ and $u_n = 3.8$.
  • Figure 2: Top left: $p^*_n(z_1, 0; \theta)$; top right: $p^*_n(1, z_2; \theta)$; bottom left: $r_n(h_1, 0; \theta)$ (blue) and $r^*_n(h_1, 0; \theta)$ (black); bottom right: $r_n(1, h_2; \theta)$ (blue) and $r^*_n(1, h_2; \theta)$ (black), with $\theta=(0,1)$.
  • Figure 3: Realizations of the posterior densities in the exponential model (blue), their Gaussian approximations (black), and the standard, non-selective posteriors (red). Left: deterministic selection; right: randomized selection.
  • Figure 4: Realizations of the exact marginal posterior densities of $h_1$ and $h_2$ in the inference on winners model (blue) and their Gaussian approximations (black). Exact marginal 90% credible intervals shown by vertical lines.
  • Figure 5: Cumulative distribution functions of $\Pi(\theta_0\mid Y^n)$ in the exponential model (blue) with the same settings as before, and of the corresponding Gaussian posterior distributions (black). Left: $n=50$; right: $n=100$.

Theorems & Definitions (9)

  • Theorem 1
  • Lemma 1
  • Example 3.1
  • Example 3.2
  • Example 3.3
  • Example 3.4
  • Proposition 1
  • Proposition 2
  • Proposition 3