The $s$-value: evaluating stability with respect to distributional shifts

Suyash Gupta, Dominik Rothenhäusler

Abstract

Common statistical measures of uncertainty such as $p$-values and confidence intervals quantify the uncertainty due to sampling, that is, the uncertainty due to not observing the full population. However, sampling is not the only source of uncertainty. In practice, distributions change between locations and across time. This makes it difficult to gather knowledge that transfers across data sets. We propose a measure of instability that quantifies the distributional instability of a statistical parameter with respect to Kullback-Leibler divergence, that is, the sensitivity of the parameter under general distributional perturbations within a Kullback-Leibler divergence ball. In addition, we quantify the instability of parameters with respect to directional or variable-specific shifts. Measuring instability with respect to directional shifts can be used to detect the type of shifts a parameter is sensitive to. We discuss how such knowledge can inform data collection for improved estimation of statistical parameters under shifted distributions. We evaluate the performance of the proposed measure on real data and show that it can elucidate the distributional instability of a parameter with respect to certain shifts and can be used to improve estimation accuracy under shifted distributions.

Paper Structure

This paper contains 40 sections, 25 theorems, 121 equations, 7 figures, 6 tables.

Key Result

Theorem 1

Let $Z \sim P_{0}$ be a real-valued random variable with mean $\mu(P_0)=\mathbb{E}_{P_0}[Z]$ and finite moment generating function on $\mathbb{R}$. Then, we have [...]. Further, if the infimum in \ref{eqn:variational-formula} is attained at some $\lambda^* \in \mathbb{R}$, then the infimum in \ref{eqn:r-mean} is attained at some probability distribution $Q$ given by [...].
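The displayed equations of Theorem 1 are not reproduced here, so the following is only a sketch under a stated assumption: that the theorem is an instance of the standard Kullback-Leibler duality $\inf_{Q:\,\mathbb{E}_Q[Z]=0} D_{KL}(Q\,\|\,P_0) = -\log \inf_{\lambda} \mathbb{E}_{P_0}[e^{\lambda Z}]$, with the optimal $Q$ obtained by exponentially tilting $P_0$ by $e^{\lambda^* z}$. Whether this matches the paper's eqn:variational-formula and eqn:r-mean term by term is an assumption. The function name `tilt_to_zero_mean` and the toy sample are hypothetical, and $P_0$ is replaced by the empirical distribution of the sample.

```python
import numpy as np

def tilt_to_zero_mean(z, tol=1e-12):
    """Hypothetical helper: minimise the empirical MGF lam -> mean(exp(lam * z)).

    Returns (lam_star, mgf_min, weights), where weights define the
    exponentially tilted empirical distribution with mean zero.
    """
    z = np.asarray(z, dtype=float)
    if not (z.min() < 0.0 < z.max()):
        raise ValueError("sample must contain both negative and positive values")

    def tilted_mean(lam):
        # Mean of z under weights proportional to exp(lam * z),
        # shifted by the maximum exponent for numerical stability.
        a = lam * z
        w = np.exp(a - a.max())
        return np.sum(w * z) / np.sum(w)

    # The derivative of the MGF has the same sign as tilted_mean, which is
    # increasing in lam, so the minimiser is the root of tilted_mean.
    lo, hi = -1.0, 1.0
    while tilted_mean(lo) > 0.0:   # widen the bracket to the left if needed
        lo *= 2.0
    while tilted_mean(hi) < 0.0:   # widen the bracket to the right if needed
        hi *= 2.0
    while hi - lo > tol:           # plain bisection
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    lam_star = 0.5 * (lo + hi)

    e = np.exp(lam_star * z)
    mgf_min = e.mean()             # minimised empirical MGF
    weights = e / e.sum()          # tilted distribution Q on the sample points
    return lam_star, mgf_min, weights

# Toy sample (an assumption for illustration, not data from the paper).
z = np.array([-1.0, 0.5, 2.0, 3.0])
lam_star, mgf_min, weights = tilt_to_zero_mean(z)

# KL divergence of the tilted distribution from the uniform empirical one.
kl = np.sum(weights * np.log(weights * len(z)))
print(lam_star, mgf_min, kl, -np.log(mgf_min))
```

In this sketch the tilted weights have mean zero by construction, and their Kullback-Leibler divergence from the empirical distribution coincides with the negative logarithm of the minimised moment generating function, which is the duality assumed above. Bisection is used only to keep the example dependency-free; any one-dimensional convex optimiser would serve.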

Figures (7)

  • Figure 1: Distribution shift can change the parameter of interest.
  • Figure 2: The plot shows the estimated minimum and maximum values of the average treatment effect for the NSW data that are achievable when allowing for a distribution shift in some covariate (cf. equation \ref{eq:upper-lower}).
  • Figure 3: Parameter transfer on the NSW data set. The transfer procedure described in Section \ref{sec:transfer-learning} is compared to a naive procedure that uses only the training distribution, and to a full transfer procedure that uses data on all covariates from the new distribution. The green, red, and blue bars show the performance of transfer learning with partial new data, transfer learning with full new data, and the naive method, respectively. Error bars show the range of error over 20 repetitions.
  • Figure 4: The plot shows the estimated minimum and maximum values of the regression coefficient for the wine quality data set that are achievable under a distribution shift in one covariate (as defined in equation \ref{eq:upper-lower}).
  • Figure 5: This figure shows the effectiveness of a two-stage transfer procedure for the wine quality data set. The green, red, and blue bars show the performance of transfer learning with partial new data, transfer learning with full new data, and the naive method, respectively. Error bars show the range of error over 20 repetitions. Transfer learning outperforms the naive method in almost all cases.
  • ...and 2 more figures

Theorems & Definitions (45)

  • Theorem 1: Theorem 5.2, DonskerVa76
  • Theorem 2
  • Example 1: Distribution with positive support
  • Example 2: Gaussian distribution
  • Example 3: Directional stability
  • Example 4: Average treatment effect
  • Lemma C.1: Consistency of $s$-value
  • Lemma C.2: Asymptotic normality of $s$-values
  • Lemma C.3: Consistency of directional $s$-value
  • Lemma C.4: Asymptotic normality of directional $s$-values
  • ...and 35 more