Hypothesis Testing over Observable Regimes in Singular Models

Sean Plummer

Abstract

Hypothesis testing in singular statistical models is often regarded as inherently problematic due to non-identifiability and degeneracy of the Fisher information. We show that the fundamental obstruction to testing in such models is not singularity itself, but the formulation of hypotheses on non-identifiable parameter quantities. Testing is inherently a problem in distribution space: if two hypotheses induce overlapping subsets of the model class, then no uniformly consistent test exists. We formalize this overlap obstruction and show that hypotheses depending on non-identifiable parameter functions necessarily fail in this sense. In contrast, hypotheses formulated over identifiable observables (quantities that are determined by the induced distribution) reduce entirely to classical testing theory. When the corresponding distributional regimes are separated in Hellinger distance, uniformly consistent tests exist and posterior contraction follows from standard testing-based arguments. Near singular boundaries, separation may collapse locally, leading to scale-dependent detectability governed jointly by sample size and distance to the singular stratum. We illustrate these phenomena in Gaussian mixture models and reduced-rank regression, exhibiting both untestable non-identifiable hypotheses and classically testable identifiable ones. The results provide a structural classification of which hypotheses in singular models are statistically meaningful.

Paper Structure (27 sections, 6 theorems, 39 equations, 4 figures)

Key Result

Lemma 1

If $\mathcal{M}_0 \cap \mathcal{M}_1 \neq \varnothing$, then for any test $\phi_n$, $\alpha_n(\phi_n) + \beta_n(\phi_n) \geq 1$. Consequently, no sequence of tests can satisfy $\alpha_n(\phi_n) \to 0$ and $\beta_n(\phi_n) \to 0$.
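
The reasoning behind this lemma can be sketched in one line. The excerpt does not define the error rates, so the sketch below assumes the usual worst-case definitions: $\alpha_n(\phi_n)$ is the supremum of the rejection probability over $\mathcal{M}_0$, and $\beta_n(\phi_n)$ is the supremum of the acceptance probability over $\mathcal{M}_1$. The symbol $P^\ast$ is introduced here for illustration and denotes any distribution in the overlap.

```latex
% Assumed definitions (not stated in the excerpt):
%   \alpha_n(\phi_n) = \sup_{P \in \mathcal{M}_0} \mathbb{E}_P[\phi_n],
%   \beta_n(\phi_n)  = \sup_{P \in \mathcal{M}_1} \mathbb{E}_P[1 - \phi_n].
% Pick any P^\ast \in \mathcal{M}_0 \cap \mathcal{M}_1. Then
\begin{align*}
\alpha_n(\phi_n) + \beta_n(\phi_n)
  &\geq \mathbb{E}_{P^\ast}[\phi_n] + \mathbb{E}_{P^\ast}[1 - \phi_n] \\
  &= 1 .
\end{align*}
```

Under these assumed definitions the bound holds for every $n$ and every test, so the two error rates cannot both tend to zero.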

Figures (4)

  • Figure 1: Empirical Type I error $\widehat{\alpha}_n$, Type II error $\widehat{\beta}_n$, and their sum for the non-identifiable component-ordering hypothesis in a two-component Gaussian mixture model. Across increasing sample sizes, $\widehat{\alpha}_n + \widehat{\beta}_n$ remains close to one, consistent with Proposition 1.
  • Figure 2: Empirical Type I error and power for the identifiable rank hypothesis in reduced-rank regression. The smallest singular value under the alternative is bounded away from zero. Power increases with sample size, consistent with classical separation theory.
  • Figure 3: Empirical Type I error and power for the identifiable single-Gaussian versus separated-mixture hypothesis. When component separation is bounded away from zero, power increases with sample size, illustrating classical testability in a singular model.
  • Figure 4: Empirical Type I and Type II errors for the non-identifiable singular-vector sign hypothesis in reduced-rank regression. The sum of errors remains close to one across increasing sample sizes, illustrating the overlap-based impossibility result.
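
The contrast described in the Gaussian mixture captions can be reproduced qualitatively with a small simulation. The sketch below is not the paper's experimental setup. It assumes equal mixture weights, unit component variances, and two illustrative test statistics chosen here (a skewness sign test and a second-moment test); every function name is hypothetical.

```python
import numpy as np

def sample_mixture(rng, reps, n, mu1, mu2):
    """Draw `reps` datasets of size n from 0.5*N(mu1, 1) + 0.5*N(mu2, 1)."""
    first = rng.random((reps, n)) < 0.5
    return np.where(first, mu1, mu2) + rng.standard_normal((reps, n))

def skewness_test(x):
    """Illustrative test of H0: mu1 < mu2 (reject when sample skewness > 0)."""
    centered = x - x.mean(axis=1, keepdims=True)
    return (centered ** 3).mean(axis=1) > 0

def second_moment_test(x):
    """Illustrative test of H0: N(0, 1), approximate level 0.05 via the CLT."""
    n = x.shape[1]
    return (x ** 2).mean(axis=1) > 1.0 + 1.645 * np.sqrt(2.0 / n)

def nonidentifiable_errors(rng, n, reps=2000):
    # (mu1, mu2) = (-1, 1) lies in H0 and (1, -1) lies in H1, yet with equal
    # weights both parameters induce the same distribution on the data.
    alpha_hat = skewness_test(sample_mixture(rng, reps, n, -1.0, 1.0)).mean()
    beta_hat = 1.0 - skewness_test(sample_mixture(rng, reps, n, 1.0, -1.0)).mean()
    return alpha_hat, beta_hat

def identifiable_errors(rng, n, delta=0.5, reps=2000):
    # H0: N(0, 1) versus H1: equal-weight mixture of N(-delta, 1), N(delta, 1).
    null_data = rng.standard_normal((reps, n))
    alt_data = sample_mixture(rng, reps, n, -delta, delta)
    alpha_hat = second_moment_test(null_data).mean()
    power_hat = second_moment_test(alt_data).mean()
    return alpha_hat, power_hat

rng = np.random.default_rng(0)
for n in (25, 100, 400):
    a, b = nonidentifiable_errors(rng, n)
    a_id, power = identifiable_errors(rng, n)
    print(f"n={n:4d}  nonid alpha+beta={a + b:.2f}  "
          f"id alpha={a_id:.2f}  id power={power:.2f}")
```

Any other test could replace `skewness_test` without changing the first column's behaviour, because the two parameter values generate identically distributed data. The second-moment test, by contrast, targets a quantity determined by the distribution, so its power can grow with $n$ when `delta` is held fixed.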

Theorems & Definitions (16)

  • Definition 1: Observable
  • Definition 2: Identifiable observable
  • Lemma 1: Overlap implies impossibility of uniform testing
  • proof
  • Definition 3: Non-identifiable quantity
  • Proposition 1: Non-identifiable hypotheses are untestable
  • proof
  • Definition 4: $\varepsilon$-separated alternatives
  • Lemma 2: Separated regimes admit tests
  • proof: Proof sketch
  • ...and 6 more