On a divergence-based prior analysis of stick-breaking processes

José A. Perusquía, Mario Diaz, Ramsés H. Mena

TL;DR

The paper investigates how different stick-breaking priors, including the Dirichlet process and the geometric process, differ in their induced random measures by deriving the mean and variance of the Kullback-Leibler divergence between them. By analyzing both uncoupled and coupled length variables, and extending to exchangeable length variables driven by a Dirichlet process, the authors quantify when complex stick-breaking processes (SBPs) are close to simpler ones in total variation. Key contributions include expressions and bounds for $D_{\mathrm{KL}}$, continuity and limiting results as hyperparameters vary, and practical guidance for prior selection via Occam's razor. The work provides a rigorous, quantitative framework to balance prior flexibility against computational and mathematical tractability in Bayesian nonparametrics.

Abstract

The nonparametric view of Bayesian inference has transformed statistics and many of its applications. The canonical Dirichlet process and other more general families of nonparametric priors have served as a gateway to solve frontier uncertainty quantification problems of large, or infinite, nature. This success is largely due to the available constructions and representations of such distributions; the two most useful are the one based on normalization of homogeneous completely random measures and the one based on stick-breaking processes. Hence, understanding their distributional features and how different random probability measures compare among themselves is a key ingredient for their proper application. In this paper, we analyse the discrepancy among some nonparametric priors employed in the literature. Initially, we compute the mean and variance of the random Kullback-Leibler divergence between the Dirichlet process and the geometric process. Subsequently, we extend our analysis to encompass a broader class of exchangeable stick-breaking processes, which includes the Dirichlet and geometric processes as extreme cases. Our results establish quantitative conditions under which all the aforementioned priors are close in total variation distance. In such instances, Occam's razor favours the simpler process.
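To make the two extreme cases in the abstract concrete, the following is a minimal sketch of truncated stick-breaking weights. It assumes the standard parametrizations: independent $\mathrm{Beta}(1,\theta)$ length variables for the Dirichlet process, and a single shared $\mathrm{Beta}(a,b)$ length variable for the geometric process. The function names, the truncation level `n_atoms`, and the use of NumPy are illustrative choices, not taken from the paper.

```python
import numpy as np


def dirichlet_process_weights(theta, n_atoms, rng):
    """Truncated stick-breaking weights with i.i.d. Beta(1, theta) lengths.

    w_1 = v_1 and w_j = v_j * prod_{i<j} (1 - v_i).
    """
    v = rng.beta(1.0, theta, size=n_atoms)
    # Stick length remaining before each break: 1, (1-v_1), (1-v_1)(1-v_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
    return v * remaining


def geometric_process_weights(a, b, n_atoms, rng):
    """Truncated weights when every break uses the same length variable.

    A single lam ~ Beta(a, b) is drawn, giving w_j = lam * (1 - lam)**(j - 1).
    """
    lam = rng.beta(a, b)
    return lam * (1.0 - lam) ** np.arange(n_atoms)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = 2.0  # hypothetical value, chosen only for illustration
    w_dp = dirichlet_process_weights(theta, 500, rng)
    w_geo = geometric_process_weights(1.0, theta, 500, rng)
    # The mass missing from each sum is the truncation error.
    print("DP mass captured:       ", w_dp.sum())
    print("Geometric mass captured:", w_geo.sum())
```

The geometric weights are always non-increasing, whereas the Dirichlet process weights are only stochastically decreasing; this is one informal sense in which the geometric process is the simpler of the two priors.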

Paper Structure

This paper contains 15 sections, 10 theorems, 114 equations, and 4 figures.

Key Result

Lemma 1

If $P$ and $Q$ are two probability measures defined on a measurable space $(\varOmega,\mathscr{F})$, then
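The displayed inequality is missing from this excerpt. Since the theorem list identifies Lemma 1 as Pinsker's inequality, the standard textbook form is given here for orientation. The notation ($d_{TV}$, $D_{\mathrm{KL}}$) and the normalization of the total variation distance are assumptions and may differ from the paper's own statement.

$$
d_{TV}(P,Q) \;=\; \sup_{A\in\mathscr{F}} \lvert P(A)-Q(A)\rvert \;\le\; \sqrt{\tfrac{1}{2}\, D_{\mathrm{KL}}(P\,\|\,Q)} .
$$

This bound is what lets a small Kullback-Leibler divergence be read as closeness in total variation.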

Figures (4)

  • Figure 1: Variance of the Kullback-Leibler divergence of the Dirichlet process $(\theta)$ with respect to the geometric process $(1,\theta)$ as a function of $\theta$.
  • Figure 2: Comparison of the variances of the Kullback-Leibler divergence for the uncoupled (red line) and coupled (blue dots) Dirichlet and geometric processes; the black dashed line represents the limit of both variances as $\theta\to\infty$.
  • Figure 3: Expectation of the Kullback-Leibler divergence of $P_{\beta}$ with respect to $P$ for different values of $\beta$ where the red line is equal to $\theta(\theta+1)^{-1}$ for $\theta=5$.
  • Figure 4: Expectation of the Kullback-Leibler divergence of $P_{\beta}$ with respect to $P$ (black line) for $\beta\in(\theta+1,50)$ and $\theta=1$, where the red line represents the limiting behaviour of the expectation as $\beta\to\infty$, and where the blue line represents the upper bound on the expectation.

Theorems & Definitions (20)

  • Lemma 1: Pinsker's Inequality
  • Theorem 1
  • Lemma 2
  • Lemma 3
  • Theorem 2
  • Lemma 4
  • Theorem 3
  • Lemma 5
  • Lemma 6
  • Conjecture 1
  • ...and 10 more