Asymptotically well-calibrated Bayesian $p$-value using the Kolmogorov-Smirnov statistic

Yueming Shen, Surya Tokdar

TL;DR

This work addresses the conservatism and potential non-uniformity of the posterior predictive $p$-value (ppp) in Bayesian model checking. It develops a general criterion for asymptotic well-calibration and proves that Kolmogorov-Smirnov (KS)-type statistics, namely the classical KS statistic (CKS) and its generalization to regression models (GKS), satisfy this criterion under contiguous alternatives. The authors show that ppp(CKS) and ppp(GKS) are asymptotically Uniform$(0,1)$ and demonstrate their finite-sample reliability and power through Gamma model and Gamma GLM simulations, including comparisons with chi-squared and score tests. The results provide robust, omnibus Bayesian model-checking tools that fit naturally into posterior predictive workflows and extend calibration theory beyond asymptotically normal statistics, with practical applicability to common regression models such as Gamma GLMs.

Abstract

The posterior predictive $p$-value (ppp) is widely used in Bayesian model evaluation. However, due to double use of the data, the ppp may not be a valid $p$-value even in large samples: The asymptotic null distribution of the ppp can be non-uniform unless the underlying test statistic satisfies certain well-calibration conditions. Such conditions have been studied in the literature for asymptotically normal test statistics. We extend this line of work by establishing well-calibration conditions for test statistics that are not necessarily asymptotically normal. In particular, we show that Kolmogorov-Smirnov (KS)-type test statistics satisfy these conditions, such that their ppps are asymptotically well-calibrated Bayesian $p$-values. KS-type statistics are versatile, omnibus, and sensitive to model misspecifications. They apply to i.i.d. real-valued data, as well as non-identically distributed observations under regression models. Numerical experiments demonstrate that such $p$-values are well behaved in finite samples and can effectively detect a wide range of alternative models.
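To make the idea concrete, here is a minimal Monte Carlo sketch of a ppp built on a KS-type statistic. It is an illustration under simplifying assumptions, not the authors' implementation: the Gamma shape is treated as known, the rate gets a conjugate Gamma prior (hyperparameters `a0`, `b0`), and the statistic is $\sqrt{n}\,\sup_t |F_n(t) - F(t \mid \hat{\theta})|$ with $\hat{\theta}$ the maximum likelihood estimate recomputed on each dataset. The function names `ks_stat` and `ppp_ks` are hypothetical.

```python
import numpy as np
from scipy import stats


def ks_stat(y, shape):
    """KS-type statistic: sqrt(n) * sup_t |F_n(t) - F(t | rate_hat)|.

    Assumption: the Gamma shape is known and rate_hat is the MLE of the
    rate computed from the same data y.
    """
    n = y.size
    rate_hat = shape / y.mean()  # MLE of the rate when the shape is known
    u = np.sort(stats.gamma.cdf(y, a=shape, scale=1.0 / rate_hat))
    i = np.arange(1, n + 1)
    # Supremum of |F_n - F| is attained at a jump of the empirical CDF.
    d = max(np.max(i / n - u), np.max(u - (i - 1) / n))
    return np.sqrt(n) * d


def ppp_ks(y, shape=2.0, a0=1.0, b0=1.0, n_rep=2000, rng=None):
    """Monte Carlo posterior predictive p-value for the statistic above.

    Assumption: rate ~ Gamma(a0, rate=b0) a priori, so the posterior is
    Gamma(a0 + n * shape, rate=b0 + sum(y)) by conjugacy.
    """
    rng = np.random.default_rng(rng)
    n = y.size
    # Draw rates from the posterior (NumPy parameterizes by scale = 1 / rate).
    rates = rng.gamma(a0 + n * shape, 1.0 / (b0 + y.sum()), size=n_rep)
    t_obs = ks_stat(y, shape)
    # One replicated dataset per posterior draw; the statistic is recomputed on each.
    t_rep = np.array(
        [ks_stat(rng.gamma(shape, 1.0 / r, size=n), shape) for r in rates]
    )
    return float(np.mean(t_rep >= t_obs))
```

Repeating `ppp_ks` over many datasets simulated from the assumed model and plotting the resulting values gives a rough empirical check of uniformity in this simplified setting; data from a clearly different distribution should produce values near zero.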

Paper Structure

This paper contains 16 sections, 14 theorems, 61 equations, 4 figures.

Key Result

Proposition 1

For any constant $c \in \mathbb{R}$, define $A_n(c) := \{\bm{\theta} \in \Theta : \lVert\bm{\theta}-\bm{\theta}_0\rVert \le c/\sqrt{n} \}$. Let $T_n$ be a test statistic with CDF $G_n(t \mid \bm{\theta})$. If Then the ppp of $T_n$ is asymptotically well calibrated.
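For orientation, the ppp and the notion of well-calibration can be written out with the standard textbook definitions. The symbols $\bm{y}^{\mathrm{rep}}$ (replicated data) and $\pi(\bm{\theta} \mid \bm{y})$ (posterior density) are notation assumed here, as is the convention that large values of $T_n$ indicate misfit; the paper's exact formulation may differ. When $G_n$ is continuous in $t$,

$$
\mathrm{ppp}(\bm{y}) = \Pr\{T_n(\bm{y}^{\mathrm{rep}}) \ge T_n(\bm{y}) \mid \bm{y}\} = \int_{\Theta} \{1 - G_n(T_n(\bm{y}) \mid \bm{\theta})\}\, \pi(\bm{\theta} \mid \bm{y})\, d\bm{\theta},
$$

and asymptotic well-calibration means that, when the data are generated under $\bm{\theta}_0$, $\mathrm{ppp}(\bm{Y})$ converges in distribution to Uniform$(0,1)$ as $n \to \infty$.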

Figures (4)

  • Figure 1: Gamma model example. Kernel density estimates of the null distributions of the ppp(CKS) under two priors, two estimators, and four sample sizes.
  • Figure 2: Gamma model example. Kernel density estimates of distributions of the ppp under different test statistics and data-generating models.
  • Figure 3: Gamma GLM example. Kernel density estimates of the null distributions of the ppp(GKS) under two priors, two estimators, and four sample sizes.
  • Figure 4: Gamma GLM example. Kernel density estimates of distributions of the ppp under different test statistics and data-generating models.

Theorems & Definitions (34)

  • Proposition 1
  • proof
  • Lemma 1
  • proof: Proof of Lemma 1
  • Lemma 2
  • proof
  • Theorem 1
  • Remark 1
  • Proposition 2
  • Remark 2
  • ...and 24 more