Inference for Median and a Generalization of HulC

Manit Paul, Arun Kumar Kuchibhotla

Abstract

Constructing distribution-free confidence intervals for the median, a classic problem in statistics, has seen numerous solutions in the literature. While coverage validity has received ample attention, less has been explored about interval width. Our study breaks new ground by investigating the width of these intervals under non-standard assumptions. Surprisingly, we find that properly scaled, the interval width converges to a non-degenerate random variable, unlike traditional intervals. We also adapt our findings for constructing improved confidence intervals for general parameters, enhancing the existing HulC procedure. These advances provide practitioners with more robust tools for data analysis, reducing the need for strict distributional assumptions.

Paper Structure

This paper contains 32 sections, 24 theorems, 334 equations, 10 figures, 3 algorithms.

Key Result

Theorem 1

Suppose $X_1, X_2, \ldots, X_n$ are independent and identically distributed as $F$ with median $\theta_0$, i.e., $\mathbb{P}(X_i \le \theta_0) \ge 1/2$ and $\mathbb{P}(X_i \ge \theta_0) \ge 1/2$. Then the confidence interval returned by the proposed algorithm (alg:proposed-conf-int) satisfies the following:
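
As a point of reference, the classical order-statistic interval for the median can be sketched in a few lines of Python. This is not necessarily the procedure of alg:proposed-conf-int, whose steps are not reproduced in this summary; it is the textbook construction $[X_{(k)}, X_{(n-k+1)}]$, which uses only the median definition stated in the theorem. The function names `binom_cdf_half` and `median_ci` are illustrative assumptions. Under the theorem's conditions, each endpoint misses $\theta_0$ with probability at most $\mathbb{P}(\mathrm{Bin}(n, 1/2) \le k-1)$, so the sketch picks the largest $k$ for which this bound is at most $\alpha/2$.

```python
import math


def binom_cdf_half(n, k):
    """Return P(Bin(n, 1/2) <= k), computed exactly from binomial coefficients."""
    if k < 0:
        return 0.0
    return sum(math.comb(n, j) for j in range(min(k, n) + 1)) / 2 ** n


def median_ci(xs, alpha=0.05):
    """Classical distribution-free interval [X_(k), X_(n-k+1)] for the median.

    Illustrative sketch only (hypothetical helper, not the paper's algorithm).
    Returns (lower, upper, k), where k is the 1-based rank of the lower endpoint.
    """
    n = len(xs)
    # Even the widest interval [X_(1), X_(n)] only covers with probability
    # at least 1 - 2^{-(n-1)}, so smaller alpha cannot be achieved.
    if 2.0 ** (-(n - 1)) > alpha:
        raise ValueError("sample too small for the requested level")
    xs_sorted = sorted(xs)
    k = 1
    # Rank k + 1 is admissible when P(Bin(n, 1/2) <= k) <= alpha / 2.
    while binom_cdf_half(n, k) <= alpha / 2:
        k += 1
    return xs_sorted[k - 1], xs_sorted[n - k], k
```

The early check in `median_ci` is the requirement of this classical interval: with fewer than $\log_2(2/\alpha)$ observations, even the sample range $[X_{(1)}, X_{(n)}]$ has a coverage bound below $1-\alpha$.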

Figures (10)

  • Figure 1: The first two plots from the left show $c_{n,\alpha}-\sqrt{n}z_{\alpha/2}/2$ as $n$ varies ($n\geq \log_2(2/\alpha)$) for fixed $\alpha=0.01, 0.05$. The remaining two plots show $c_{n,\alpha}-\sqrt{n}z_{\alpha/2}/2$ as $\alpha$ varies ($\alpha>2^{-(n-1)}$) for fixed $n=15, 25$.
  • Figure 2: Density of the limiting distribution $\mathscr{G}(W, z_{\alpha/2})$ for different values of the level of significance $\alpha = 0.01, 0.05, 0.1$ as $\rho$ varies from $0.5$ to $10$. The densities have been trimmed at $y \in [0, 60]$ and $x \in [1.5, 3]$ for better visibility.
  • Figure 3: Density of the limiting distribution $\mathscr{G}(W, z_{\alpha/2})$ for different values of $\rho = 0.75, 2, 10$ and for different pairs $(M_-, M_+) \in \{(0.5, 0.5), (0.2, 0.8), (0.4, 0.6)\}$. The densities have been trimmed at $y \in [0, 15]$ and $x \in [0.5, 2.5]$ for better visibility.
  • Figure 4: Histogram of the scaled width $n^{1/(2\rho)}M^{1/\rho}\mathrm{Width}(\widehat{\hbox{CI}}_{n,\alpha})$ (for sample size $n = 1000$) along with the density of the limiting distribution $\mathscr{G}(W, z_{\alpha/2})$ for different values of $\rho = 0.75, 2, 10$. The densities and histograms have been trimmed at $y \in [0, 10]$ and $x \in [1.5, 3]$ for better visibility.
  • Figure 5: Comparison of the width and coverage of different types of confidence intervals (the distribution-free C.I., the sub-sample based C.I. (with estimated rate of convergence and with sub-sample size $n^{1/2}$), and the classical bootstrap based C.I.) for different sample sizes ($n = 50, 200, 500, 1000$) and for different growth rates of the distribution function on either side of the median ($\rho$ takes values from $0.2$ to $10$). The box-plots have been thresholded at $y = 40$ for better visibility, so there might be some outlying observations beyond the threshold.
  • ...and 5 more figures

Theorems & Definitions (43)

  • Theorem 1
  • Remark 1: Sample size condition and impossibility
  • Remark 2: Comparison with Wald and bootstrap confidence intervals
  • Remark 3: Coverage of quantiles close to the median
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • ...and 33 more