Introducing the b-value: combining unbiased and biased estimators from a sensitivity analysis perspective

Zhexiao Lin; Peter J. Bickel; Peng Ding

TL;DR

This paper addresses inference when combining an unbiased but noisy estimator with a biased but precise one under unknown bias. It introduces a sensitivity-analysis framework that indexes inference by a maximum relative bias $b$, and defines a b-value to indicate when combining estimators ceases to yield significant conclusions. Focusing on three canonical estimators—precision-weighted ($\widehat{\tau}_{PW}$), pretest ($\widehat{\tau}_{PT}$), and soft-thresholding ($\widehat{\tau}_{ST}$)—the authors derive sequences of confidence intervals and establish that $\widehat{\tau}_{ST}$ offers robust performance with bounded worst-case risk while remaining efficient when the bias is small. The framework extends to multivariate and multiple-estimator settings, provides explicit forms for confidence regions, and is illustrated with an empirical study on IV/OLS bounds in economics, offering practical guidance and accompanying software for practitioners.

Abstract

In empirical research, when we have multiple estimators for the same parameter of interest, a central question arises: how do we combine unbiased but less precise estimators with biased but more precise ones to improve the inference? Under this setting, the point estimation problem has attracted considerable attention. In this paper, we focus on a less studied inference question: how can we conduct valid statistical inference in such settings with unknown bias? We propose a strategy to combine unbiased and biased estimators from a sensitivity analysis perspective. We derive a sequence of confidence intervals indexed by the magnitude of the bias, which enable researchers to assess how conclusions vary with the bias levels. Importantly, we introduce the notion of the b-value, a critical value of the unknown maximum relative bias at which combining estimators does not yield a significant result. We apply this strategy to three canonical combined estimators: the precision-weighted estimator, the pretest estimator, and the soft-thresholding estimator. For each estimator, we characterize the sequence of confidence intervals and determine the bias threshold at which the conclusion changes. Based on the theory, we recommend reporting the b-value based on the soft-thresholding estimator and its associated confidence intervals, which are robust to unknown bias and achieve the lowest worst-case risk among the alternatives.
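To make the three combined estimators concrete, here is a minimal Python sketch. It is not the paper's code: the formulas below are common textbook forms, and they are assumptions on my part. In particular, the sketch assumes two independent estimators with known variances, a pretest that compares the standardized difference to a normal critical value, and a soft-thresholding estimator that subtracts a soft-thresholded estimate of the bias before precision weighting. The function names, the default cutoff `crit=1.96`, and the default threshold `lam=1.96` are all hypothetical.

```python
import math


def precision_weighted(tau0, tau1, var0, var1):
    """Inverse-variance weighted average of tau0 (unbiased) and tau1 (biased)."""
    w = var0 / (var0 + var1)  # weight placed on the more precise, biased estimator
    return tau0 + w * (tau1 - tau0)


def pretest(tau0, tau1, var0, var1, crit=1.96):
    """Assumed form: combine only if the two estimators do not differ significantly."""
    z = (tau1 - tau0) / math.sqrt(var0 + var1)  # standardized difference
    if abs(z) <= crit:
        return precision_weighted(tau0, tau1, var0, var1)
    return tau0  # fall back to the unbiased estimator


def soft_threshold(x, lam):
    """Shrink x toward zero by lam, returning 0 when |x| <= lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def soft_thresholding(tau0, tau1, var0, var1, lam=1.96):
    """Assumed form: remove a soft-thresholded bias estimate, then precision-weight."""
    d = tau1 - tau0                        # noisy estimate of the bias
    sd_d = math.sqrt(var0 + var1)          # standard deviation of d under independence
    bias_hat = soft_threshold(d, lam * sd_d)
    w = var0 / (var0 + var1)
    return tau0 + w * (d - bias_hat)


if __name__ == "__main__":
    # Small disagreement: all three estimators coincide with the precision-weighted one.
    print(precision_weighted(1.0, 1.2, 1.0, 0.25))
    print(pretest(1.0, 1.2, 1.0, 0.25))
    print(soft_thresholding(1.0, 1.2, 1.0, 0.25))
    # Large disagreement: pretest reverts to tau0, soft-thresholding moves only part way.
    print(pretest(1.0, 5.0, 1.0, 0.25))
    print(soft_thresholding(1.0, 5.0, 1.0, 0.25))
```

Under these assumed forms, the soft-thresholding estimator can deviate from the unbiased estimator by at most `w * lam * sd_d`, which gives some intuition for the robustness to unknown bias that the abstract describes.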

Paper Structure (44 sections, 17 theorems, 214 equations, 2 figures)

Introduction
Problem setup and a review of point estimation
Problem Setup
Point estimation: a review
Confidence intervals, hypothesis testing, and the b-value
Confidence interval based on the precision-weighted estimator
Confidence interval based on the pretest estimator
Confidence interval based on the soft-thresholding estimator
Comparison of the point estimators and confidence intervals
Generalization to the dependent case
Generalization to the multivariate case
Setup
Point estimation
Confidence intervals
Generalization to multiple estimators
...and 29 more sections

Key Result

Theorem 3.1

Let $\widehat{L}_{\rm PW} = \widehat{L}_{\rm PW}(b, \zeta, \gamma) \ge 0$ denote the solution to the equation of $L$ [equation not captured in this summary]. The solution $\widehat{L}_{\rm PW}$ always exists and is unique. The shortest symmetric confidence interval centered at $\widehat{\tau}_{\rm PW}$ for $\tau$ that satisfies the paper's coverage requirement is given by $[\widehat{\tau}_{\rm PW} - \widehat{L}_{\rm PW} (1+\gamma)^{-1/2} \sigma_0, \widehat{\tau}_{\rm PW} + \widehat{L}_{\rm PW} (1+\gamma)^{-1/2} \sigma_0]$.
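The sketch below illustrates how a bias-indexed interval and a b-value could be computed for the precision-weighted estimator. It is a hedged illustration, not the theorem's exact statement. It assumes independent normal estimators, assumes $\gamma = \sigma_0^2/\sigma_1^2$ (which matches the $(1+\gamma)^{-1/2}\sigma_0$ scaling in the theorem), and uses the standard bounded-bias construction in which the half-length $L$ solves $\Phi(L - m) - \Phi(-L - m) = 1 - \alpha$ with worst-case standardized bias $m = b\gamma(1+\gamma)^{-1/2}$. The level `alpha`, the search cap `b_max`, and all function names are hypothetical, and the role of $\zeta$ in the paper is not modeled here.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def coverage(L, m):
    """P(|Z + m| <= L) for Z ~ N(0, 1): coverage at standardized bias m."""
    return norm_cdf(L - m) - norm_cdf(-L - m)


def half_length(m, alpha=0.05, tol=1e-10):
    """Smallest L with worst-case coverage 1 - alpha when |bias| <= m (bisection)."""
    m = abs(m)
    lo, hi = 0.0, m + 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coverage(mid, m) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pw_interval(tau0, tau1, sigma0, sigma1, b, alpha=0.05):
    """Interval for tau centered at the precision-weighted estimate, given |Delta/sigma0| <= b."""
    gamma = sigma0**2 / sigma1**2            # assumed definition of gamma
    est = tau0 + gamma / (1.0 + gamma) * (tau1 - tau0)
    sd = sigma0 / math.sqrt(1.0 + gamma)     # standard deviation of the combined estimate
    m = b * gamma / math.sqrt(1.0 + gamma)   # worst-case standardized bias
    L = half_length(m, alpha)
    return est - L * sd, est + L * sd


def pw_b_value(tau0, tau1, sigma0, sigma1, alpha=0.05, b_max=100.0, tol=1e-8):
    """Smallest b at which the interval covers zero (0 if already covered at b = 0)."""
    def covers_zero(b):
        lower, upper = pw_interval(tau0, tau1, sigma0, sigma1, b, alpha)
        return lower <= 0.0 <= upper

    if covers_zero(0.0):
        return 0.0
    if not covers_zero(b_max):
        return math.inf  # significant for every b up to the search cap
    lo, hi = 0.0, b_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if covers_zero(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(pw_interval(2.0, 1.5, 1.0, 0.5, b=0.5))
    print(pw_b_value(2.0, 1.5, 1.0, 0.5))
```

Because the half-length grows with $b$, the interval widens monotonically, so bisection over $b$ is enough to locate the point at which zero first enters the interval. A larger b-value in this sketch means the significant conclusion survives a larger assumed relative bias.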

Figures (2)

Figure 1: Confidence intervals against the maximum relative bias $\lvert \Delta/\sigma_0 \rvert \le b$
Figure 2: Confidence intervals against the maximum relative bias $\lvert \Delta/\sigma_0 \rvert \le b$

Theorems & Definitions (44)

Example 1.1
Example 1.2
Definition 2.1
Definition 2.2
Remark 2.1
Theorem 3.1
Lemma 3.1
Theorem 3.2
Lemma 3.2
Theorem 3.3
...and 34 more
