Comparing latent inequality with ordinal data

David M. Kaplan, Wei Zhao

TL;DR

This paper develops a nonparametric framework to compare two latent distributions using only ordinal data, avoiding parametric assumptions about the underlying continuous distributions. It establishes identification results for between-group inequality via latent quantiles and for within-group inequality via interquantile ranges, under threshold-shift and threshold-leq-one conditions on the latent thresholds. The authors introduce inner confidence sets for the latent-inequality constructs and provide frequentist and Bayesian inference methods, including special-case and general constructions for between- and within-group comparisons. Empirical illustrations with NHIS health data demonstrate that meaningful latent disparities can be detected even when latent means are not comparable nonparametrically, highlighting the approach’s practical relevance for health, education, and policy analysis.

Abstract

We propose new ways to compare two latent distributions when only ordinal data are available and without imposing parametric assumptions on the underlying continuous distributions. First, we contribute identification results. We show how certain ordinal conditions provide evidence of between-group inequality, quantified by particular quantiles being higher in one latent distribution than in the other. We also show how other ordinal conditions provide evidence of higher within-group inequality in one distribution than in the other, quantified by particular interquantile ranges being wider in one latent distribution than in the other. Second, we propose an "inner" confidence set for the quantiles that are higher for the first latent distribution. We also describe frequentist and Bayesian inference on features of the ordinal distributions relevant to our identification results. Our contributions are illustrated by empirical examples with mental health and general health.

Paper Structure (25 sections, 7 theorems, 44 equations, 3 figures, 1 table)

Key Result

Theorem 2.1

Let Assumptions a:ordinal and a:thresh-shift hold. If there exist categories $j<k$ with $F_X(j)<F_Y(j)$ and $F_Y(k)<F_X(k)$, then $Q_X^*(\tau_2)-Q_X^*(\tau_1) < Q_Y^*(\tau_2)-Q_Y^*(\tau_1)$ for any combination of $\tau_1 \in \mathcal{T}_1=( F_X(j) , F_Y(j) ]$ and $\tau_2 \in \mathcal{T}_2=( F_Y(k) , F_X(k) ]$. If in
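
The ordinal condition in this theorem can be checked mechanically once the two ordinal CDFs are in hand. The following is a minimal sketch, not the authors' code: the function names `ordinal_cdf` and `within_group_evidence` and the category counts are hypothetical, and the upper interval is taken as $\mathcal{T}_2=(F_Y(k), F_X(k)]$, which is the interval that is nonempty when $F_Y(k)<F_X(k)$. It treats the supplied CDFs as known population values, so it ignores sampling uncertainty, which the paper's frequentist and Bayesian inference methods are meant to address.

```python
from itertools import combinations


def ordinal_cdf(counts):
    """Return cumulative proportions F(1), ..., F(J) from category counts."""
    total = sum(counts)
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / total)
    return cdf


def within_group_evidence(F_X, F_Y):
    """Find category pairs j < k with F_X(j) < F_Y(j) and F_Y(k) < F_X(k).

    Returns a list of tuples (j, k, T1, T2) with 1-indexed categories, where
    T1 = (F_X(j), F_Y(j)] and T2 = (F_Y(k), F_X(k)] are stored as
    (open lower endpoint, closed upper endpoint).
    """
    if len(F_X) != len(F_Y):
        raise ValueError("both CDFs must cover the same categories")
    results = []
    for j, k in combinations(range(len(F_X)), 2):  # guarantees j < k
        if F_X[j] < F_Y[j] and F_Y[k] < F_X[k]:
            results.append((j + 1, k + 1, (F_X[j], F_Y[j]), (F_Y[k], F_X[k])))
    return results


# Hypothetical counts for five ordered categories (illustration only).
F_X = ordinal_cdf([10, 30, 40, 15, 5])
F_Y = ordinal_cdf([20, 30, 20, 20, 10])
pairs = within_group_evidence(F_X, F_Y)
for j, k, T1, T2 in pairs:
    print(f"j={j}, k={k}: tau1 in ({T1[0]}, {T1[1]}], tau2 in ({T2[0]}, {T2[1]}]")
```

Each returned pair indicates a range of $(\tau_1,\tau_2)$ combinations for which, under the theorem's assumptions, the latent interquantile range of $X$ is narrower than that of $Y$.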

Figures (3)

  • Figure 1: Example latent CDFs (lines) and ordinal CDFs (shapes).
  • Figure 2: Illustration of the within-group inequality result (res:within).
  • Figure 3: Empirical CDFs of mental health score.

Theorems & Definitions (7)

  • Theorem 2.1
  • Corollary 2.1
  • Theorem 2.2
  • Proposition 3.1
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3