Infer-and-widen, or not?

Ronan Perry; Zichun Xu; Olivia McGough; Daniela Witten

Abstract

In recent years, there has been substantial interest in the task of selective inference: inference on a parameter that is selected from the data. Many of the existing proposals fall into what we refer to as the \emph{infer-and-widen} framework: they produce symmetric confidence intervals whose midpoints do not account for selection and therefore are biased; thus, the intervals must be wide enough to account for this bias. In this paper, we investigate infer-and-widen approaches in three vignettes: the winner's curse, maximal contrasts, and inference after the lasso. In each of these examples, we show that a state-of-the-art infer-and-widen proposal leads to confidence intervals that are wider than a non-infer-and-widen alternative. Furthermore, even an ``oracle'' infer-and-widen confidence interval -- the narrowest possible interval that could be theoretically attained via infer-and-widen -- can be wider than the alternative.
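The selection bias described in the abstract can be seen in a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes independent observations $Y_i \sim N(\mu_i, \sigma^2)$ with every $\mu_i = 0$ and selection of the largest observation. The function name `winners_curse_bias` and its parameters are hypothetical.

```python
import random
import statistics

def winners_curse_bias(n=10, sigma=1.0, reps=5000, seed=0):
    """Average of (Y_selected - mu_selected) when the largest observation is selected.

    Assumption for illustration: Y_i ~ N(0, sigma^2) independently, so the
    selected mean is 0 and the error of the naive midpoint is Y_selected itself.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        y = [rng.gauss(0.0, sigma) for _ in range(n)]
        errors.append(max(y))  # mu_selected = 0, so error = Y_selected
    return statistics.fmean(errors)

bias = winners_curse_bias()
print(f"average selection bias of the naive midpoint: {bias:.3f}")
```

Under these assumptions the average error is clearly positive, so an interval centered at the selected observation has a biased midpoint and must be widened to retain coverage.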

Paper Structure (11 sections, 5 theorems, 27 equations, 3 figures)

Introduction
The infer-and-widen framework
Other approaches for valid selective inference
Contributions and organization
Formalizing the infer-and-widen framework
Vignette #1: The winner's curse
Vignette #2: Maximal contrasts
Vignette #3: Inference after the lasso
The bias of infer-and-widen intervals
Discussion
Deferred notation

Key Result

proposition 1

Under eq:vignette-1, consider the selection rule $\SelL(\cdot)$ defined in eq:vignette-1-selection-laplace. For any $\eta > 0$ and $\nu \in (0, \alpha)$ such that $\eta c \geq 2 z_{1 - \alpha (\alpha - \nu)/2n }$, where $z_q$ denotes the $q$th quantile of the standard normal distribution, the infer-and-widen interval has $1-\alpha$ unconditional coverage for $\mu_{\SelL(Y)}$ in the sense of eq:infer_and_widen_ci.
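A randomized selection rule of this kind can be sketched as follows. This summary does not state the rule itself, so the code assumes that independent Laplace noise with scale $c$ (variance $2c^2$, matching the figure captions) is added to each $Y_i$ before taking the argmax. The names `laplace`, `select_with_laplace_noise`, and `mean_selected_value` are hypothetical, and the data are simulated with $\mu = 0$ and $\sigma = 1$.

```python
import random

def laplace(rng, scale):
    """Laplace(0, scale) draw as a scaled difference of two Exp(1) draws (variance 2 * scale**2)."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def select_with_laplace_noise(y, c, rng):
    """Assumed stand-in for the selection rule: argmax of Y_i plus Laplace(0, c) noise."""
    noisy = [yi + laplace(rng, c) for yi in y]
    return max(range(len(y)), key=noisy.__getitem__)

def mean_selected_value(c, n=50, reps=2000, seed=0):
    """Average Y at the selected index when every mu_i = 0 and sigma = 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += y[select_with_laplace_noise(y, c, rng)]
    return total / reps

for c in (0.1, 1.0, 5.0):
    print(f"c = {c}: average Y at the selected index = {mean_selected_value(c):.3f}")
```

In this sketch, a larger $c$ makes the selection depend less on $Y$, which shrinks the bias of the selected observation at the cost of a noisier choice.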

Figures (3)

Figure 1: Vignette #1 under \ref{eq:vignette-1} with $\mu=0$, $\sigma=1$, and the selection rule in \ref{eq:vignette-1-selection-laplace}. Results are averaged over (a) $250$ or (b, c) $1000$ simulated datasets. (a): We display the ratio of the average widths of the infer-and-widen (IW) \ref{eq:vignette-1-CI-LAS} and data fission (DF) \ref{eq:vignette-1-CI-LDF} intervals with $95\%$ coverage across values of the sample size $n$ and Laplacian noise variance $2c^2$. A width ratio that exceeds one implies that the infer-and-widen interval is wider. (b, c): Fixing $n=100$, we display the average width of the "oracle" infer-and-widen interval and the data fission interval \ref{eq:vignette-1-CI-LDF} as a function of coverage. The 95% classical interval and the "attainable" 95% infer-and-widen interval based on algorithmic stability \ref{eq:vignette-1-CI-LAS} are also displayed. The Laplacian noise variance is low and high in panels (b) and (c), respectively.
Figure 2: Vignette #2 under \ref{eq:vignette-1} with $n=100$, $\sigma=1$, $\mu=X \phi$, and the selection rule in \ref{eq:vignette-2-selection-laplace}, where $X$ is multivariate normal with column correlation $0.5$, and $\phi$ is sparse with exponentially distributed non-zero elements. Results are averaged over (a) $250$ or (b, c) $1000$ simulated datasets. (a): We display the ratio of the average widths of infer-and-widen (IW) \ref{eq:vignette-2-CI-LAS} and randomized conditional selective inference (RCSI) \ref{eq:vignette-2-CI-LDF} intervals with $95\%$ coverage across the number of features $p$ and Laplacian noise variance $2c^2$. (b, c): Fixing $p=100$, we display the average width of the "oracle" infer-and-widen interval and the RCSI interval \ref{eq:vignette-2-CI-LDF} as a function of coverage. The 95% classical interval and the "attainable" 95% infer-and-widen interval \ref{eq:vignette-2-CI-LAS} based on algorithmic stability are also displayed. The Laplacian noise variance is low and high in panels (b) and (c), respectively.
Figure 3: Vignette #3 under \ref{eq:vignette-1} with $n=100$, $p=10$, $\sigma=1$, $\mu=X \beta$, and the selection rule in \ref{eq:vignette-3-selection}, where the rows of $X$ are equi-correlated normally distributed, and $\beta$ has five nonzero elements. Results are averaged over $100$ simulated datasets. (a): We display the ratio of the average widths of $95\%$ confidence intervals from the LSI infer-and-widen method zrnic_locally_2024 and the hybrid method mccloskey_hybrid_2024. A width ratio that exceeds one implies that the infer-and-widen interval is wider. (b, c): Fixing $\rho = 0.9$, we compare the average width of the "oracle" infer-and-widen (IW) interval to the conditional lee_exact_2016 and hybrid mccloskey_hybrid_2024 intervals. For reference, we also display the average empirical widths and coverages of the $95\%$ intervals from the classic, simultaneous PoSI (SI) berk_valid_2013, and LSI zrnic_locally_2024 approaches. Panels (b) and (c) contain zero and nonzero signal, respectively.

Theorems & Definitions (5)

proposition 1: An infer-and-widen interval based on \ref{eq:vignette-1-selection-laplace}, given by zrnic_post-selection_2023
proposition 2: A data fission interval based on \ref{eq:vignette-1-selection-laplace}
proposition 3: An infer-and-widen interval based on \ref{eq:vignette-2-selection-laplace}, given by zrnic_post-selection_2023
proposition 4: A randomized conditional selective inference interval based on \ref{eq:vignette-2-selection-laplace}
proposition 5: Bounding the bias of the infer-and-widen midpoint
