Robust Sampling for Active Statistical Inference

Puheng Li, Tijana Zrnic, Emmanuel Candès

TL;DR

This work presents robust sampling strategies for active statistical inference. Robust sampling ensures that the resulting estimator is never worse than the estimator based on uniform sampling and, when the uncertainty estimates are reliable, usually outperforms standard active inference.

Abstract

Active statistical inference is a new method for inference with AI-assisted data collection. Given a budget on the number of labeled data points that can be collected and assuming access to an AI predictive model, the basic idea is to improve estimation accuracy by prioritizing the collection of labels where the model is most uncertain. The drawback, however, is that inaccurate uncertainty estimates can make active sampling produce highly noisy results, potentially worse than those from naive uniform sampling. In this work, we present robust sampling strategies for active statistical inference. Robust sampling ensures that the resulting estimator is never worse than the estimator using uniform sampling. Furthermore, with reliable uncertainty estimates, the estimator usually outperforms standard active inference. This is achieved by optimally interpolating between uniform and active sampling, depending on the quality of the uncertainty scores, and by using ideas from robust optimization. We demonstrate the utility of the method on a series of real datasets from computational social science and survey research.
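To make the sampling idea in the abstract concrete, the following is a minimal sketch for estimating a mean, not the authors' implementation. It assumes the standard inverse-probability-weighted correction around the model prediction: each label is collected with probability `pi[i]`, and collected residuals are reweighted by `1 / pi[i]`. The function names `active_probs` and `debiased_mean`, the synthetic data, and the choice of sampling probabilities proportional to the uncertainty score are all illustrative assumptions.

```python
import numpy as np

def active_probs(uncertainty, budget_frac):
    """Sampling probabilities proportional to uncertainty, with mean budget_frac.

    Clipping at 1 keeps them valid probabilities; if clipping triggers, the
    expected number of labels falls slightly below the budget.
    """
    pi = budget_frac * uncertainty / uncertainty.mean()
    return np.clip(pi, 1e-12, 1.0)

def debiased_mean(y, f, pi, rng):
    """Draw xi_i ~ Bernoulli(pi_i) and return (estimate, number of labels used).

    The estimate is mean(f_i + (y_i - f_i) * xi_i / pi_i); labels y_i only
    enter where xi_i is True.
    """
    xi = rng.random(len(pi)) < pi
    correction = np.where(xi, (y - f) / pi, 0.0)
    return float(np.mean(f + correction)), int(xi.sum())

# Synthetic illustration (hypothetical data): predictions f, labels y whose
# noise level varies across points, and an uncertainty score equal to that level.
rng = np.random.default_rng(0)
n, budget_frac = 20_000, 0.1
noise_scale = rng.uniform(0.1, 1.0, size=n)
f = rng.normal(size=n)
y = f + noise_scale * rng.normal(size=n)

pi_unif = np.full(n, budget_frac)                # uniform sampling
pi_active = active_probs(noise_scale, budget_frac)  # uncertainty-driven sampling

for name, pi in [("uniform", pi_unif), ("active", pi_active)]:
    est, used = debiased_mean(y, f, pi, rng)
    print(f"{name}: estimate={est:.4f}, labels used={used}, full mean={y.mean():.4f}")
```

In this sketch the uncertainty score matches the true noise level, which is the favorable case for active sampling. If the score were unrelated to the residual size, the `1 / pi[i]` weights could inflate the variance, which is the failure mode the abstract describes.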

Paper Structure

This paper contains 34 sections, 3 theorems, 39 equations, 15 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Suppose $\pi^{(\rho)}$ is a budget-preserving path connecting $\pi$ and $\pi^{\mathrm{unif}}$. Let $\rho^* = \mathop{\arg \min}\limits_{\rho} {\rm Var} (\hat{\theta}^{\pi^{(\rho)}})$, and suppose $\hat{\rho} = \rho^* + o_P(1)$. Then, where $\sigma_{\rho^*}^2 \leq \min\{\sigma_0^2, \sigma_1^2\}$.
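The inequality $\sigma_{\rho^*}^2 \leq \min\{\sigma_0^2, \sigma_1^2\}$ reflects that both endpoints lie on the path, so minimizing over $\rho$ cannot do worse than either one. The sketch below illustrates this with a linear and a geometric path and a grid search over $\rho$. Several parts are assumptions rather than the paper's definitions: the parameterization (here $\rho = 0$ gives $\pi$ and $\rho = 1$ gives $\pi^{\mathrm{unif}}$), the rescaling used to keep the geometric path on budget, the variance proxy `mean(err2 / pi_rho)` for a mean-estimation target, and the names `linear_path`, `geometric_path`, and `select_rho`. The sketch also assumes the rescaled probabilities stay below 1.

```python
import numpy as np

def linear_path(pi, pi_unif, rho):
    """Convex combination of the two rules; preserves the mean when both share it."""
    return (1 - rho) * pi + rho * pi_unif

def geometric_path(pi, pi_unif, rho):
    """Geometric interpolation, rescaled so the average probability stays on budget."""
    raw = pi ** (1 - rho) * pi_unif ** rho
    return raw * (pi_unif.mean() / raw.mean())

def select_rho(err2, pi, pi_unif, path, n_grid=101):
    """Grid search for the rho minimizing the variance proxy mean(err2 / pi_rho).

    err2 stands for estimates of the squared prediction error at each point
    (hypothetical input, e.g. obtained from previously labeled data).
    """
    grid = np.linspace(0.0, 1.0, n_grid)
    objective = [np.mean(err2 / path(pi, pi_unif, rho)) for rho in grid]
    return float(grid[int(np.argmin(objective))])

# Hypothetical sampling rule with average probability budget_frac.
rng = np.random.default_rng(0)
n, budget_frac = 5_000, 0.2
u = rng.uniform(0.5, 1.5, size=n)
pi = budget_frac * u / u.mean()
pi_unif = np.full(n, budget_frac)

# Two extreme cases: errors perfectly tracked by pi, and errors unrelated to pi.
cases = [("errors match pi", pi ** 2), ("errors are flat", np.ones(n))]
for label, err2 in cases:
    for path in (linear_path, geometric_path):
        rho_hat = select_rho(err2, pi, pi_unif, path)
        print(f"{label}, {path.__name__}: selected rho = {rho_hat:.2f}")
```

Because the grid contains both endpoints, the selected value of the proxy is never larger than the proxy at pure active or pure uniform sampling, mirroring the guarantee in the theorem at the level of this simplified objective.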

Figures (15)

  • Figure 1: Effective sample size and coverage on Pew post-election survey data. We compare uniform, active, and robust active sampling, for different values of the sampling budget $n_b$. The target of inference is the approval rate of a presidential candidate. We show the mean and one standard deviation of the effective sample size estimated over $500$ trials; in each trial we independently sample the observed labels.
  • Figure 2: Effective sample size on Pew post-election survey data, for different dataset sizes used to train $f$. We compare uniform, active, and robust active sampling, for different values of the sampling budget $n_b$. The target of inference is the approval rate of a presidential candidate. We show the mean and one standard deviation (see Appendix \ref{appendix:plots}) of the effective sample size estimated over $500$ trials; in each trial we independently sample the observed labels.
  • Figure 3: Effective sample size (top) and coverage (bottom) on Pew post-election survey data, for varying burn-in dataset sizes with respect to different proportions of the data. We compare uniform, active, and robust active sampling, for different values of the sampling budget $n_b$. The target of inference is the approval rate of a presidential candidate. We show the mean and one standard deviation of the effective sample size estimated over $500$ trials; in each trial we independently sample the observed labels.
  • Figure 4: Effective sample size for different budget-preserving paths on Pew post-election survey data, without (left) and with (right) a robustness constraint $\mathcal{C}$. In both cases, the geometric path leads to the largest effective sample size. The target of inference is the same as in Figure \ref{fig:election_robust}. We show the mean and one standard deviation (see Appendix \ref{appendix:plots}) of the effective sample size estimated over $500$ trials; in each trial we independently sample the observed labels.
  • Figure 5: Effective sample size on US Census data, for varying burn-in dataset sizes. We compare uniform, active, and robust active sampling, for different values of the sampling budget $n_b$. The target of inference is the relationship between age and income, estimated via a linear regression. We show the mean and one standard deviation of the effective sample size estimated over $500$ trials; in each trial we independently sample the observed labels.
  • ...and 10 more figures

Theorems & Definitions (11)

  • Definition 1: Budget-preserving path
  • Example 1: Linear path
  • Example 2: Geometric path
  • Theorem 1
  • Theorem 2
  • Proposition 1
  • proof
  • Definition 2: Geodesic (Burago et al., 2001)
  • Example 3: Linear path
  • Example 4: Geometric path
  • ...and 1 more