Cumulative differences between paired samples
Isabel Kloumann, Hannah Korevaar, Chris McConnell, Mark Tygert, Jessica Zhao
TL;DR
The paper tackles detecting differences between two paired populations conditioned on an ordinal covariate by introducing a fully nonparametric cumulative framework. It builds graphs of cumulative weighted differences $C_k$ versus abscissae $A_k$ and uses the Kuiper metric $D$ to summarize overall differences, avoiding binning and model-related biases. The authors show that this approach outperforms traditional reliability diagrams and extends naturally to multiple covariates via Hilbert space-filling curves, with statistical significance assessed through a driftless random-walk null and an estimator $\sigma^2$. Applications to synthetic data, the KDD Cup 1998 donor data, and the American Community Survey demonstrate the method’s ability to reveal structured, covariate-specific differences and provide interpretable, robust metrics.
Abstract
The simplest, most common paired samples consist of observations from two populations, with each observed response from one population corresponding to an observed response from the other population at the same value of an ordinal covariate. The pair of observed responses (one from each population) at the same value of the covariate is known as a "matched pair" (with the matching based on the value of the covariate). A graph of cumulative differences between the two populations reveals differences in responses as a function of the covariate. Indeed, the slope of the secant line connecting two points on the graph becomes the average difference over the wide interval of values of the covariate between the two points; i.e., the slope of the graph is the average difference in responses. ("Average" refers to the weighted average if the samples are weighted.) Moreover, a simple statistic known as the Kuiper metric summarizes into a single scalar the overall differences over all values of the covariate. The Kuiper metric is the absolute value of the total difference in responses between the two populations, totaled over the interval of values of the covariate for which the absolute value of the total is greatest. The total should be normalized such that it becomes the (weighted) average over all values of the covariate when the interval over which the total is taken is the entire range of the covariate (i.e., the sum for the total gets divided by the total number of observations, if the samples are unweighted, or divided by the total weight, if the samples are weighted). This cumulative approach is fully nonparametric and uniquely defined (with only one right way to construct the graphs and scalar summary statistics), unlike traditional methods such as reliability diagrams or parametric or semi-parametric regressions, which typically obscure significant differences due to their parameter settings.
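The construction described in the abstract can be illustrated with a minimal sketch. It assumes the matched pairs are already sorted by the covariate, and the function names (`cumulative_differences`, `kuiper_metric`) are hypothetical, chosen for this illustration rather than taken from the authors' software. The sketch reads the Kuiper metric as the largest absolute normalized total over any contiguous interval, which equals the maximum minus the minimum of the cumulative sums once the starting value of zero is included.

```python
from itertools import accumulate


def cumulative_differences(resp_a, resp_b, weights=None):
    """Return (abscissae, cumulative differences), each starting at 0.

    Assumes the pairs are already sorted by the ordinal covariate.
    Both outputs are normalized by the total weight, so the final
    cumulative difference is the (weighted) average difference.
    """
    n = len(resp_a)
    if len(resp_b) != n:
        raise ValueError("paired samples must have equal length")
    if weights is None:
        weights = [1.0] * n  # unweighted case: divide by the number of pairs
    total = float(sum(weights))
    steps = [w * (a - b) / total for a, b, w in zip(resp_a, resp_b, weights)]
    cumulative = [0.0] + list(accumulate(steps))
    abscissae = [0.0] + list(accumulate(w / total for w in weights))
    return abscissae, cumulative


def kuiper_metric(cumulative):
    """Largest absolute total difference over any contiguous interval."""
    return max(cumulative) - min(cumulative)


# Toy data (made up for illustration): population A responds higher
# at low covariate values and lower at high covariate values.
resp_a = [1, 1, 0, 0]
resp_b = [0, 0, 1, 1]
abscissae, cumulative = cumulative_differences(resp_a, resp_b)
d = kuiper_metric(cumulative)

# Secant slope over the first two pairs = average difference there.
slope = (cumulative[2] - cumulative[0]) / (abscissae[2] - abscissae[0])
```

In this toy case the average difference over the whole range is zero, yet the Kuiper metric is 0.5 and the secant slope over the first two pairs is 1, showing how the cumulative graph exposes differences that cancel in the overall average.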
