On a divergence-based prior analysis of stick-breaking processes
José A. Perusquía, Mario Diaz, Ramsés H. Mena
TL;DR
The paper investigates how stick-breaking process (SBP) priors, including the Dirichlet process and the geometric process, differ in their induced random measures by deriving the mean and variance of the random Kullback-Leibler divergence (D_KL) between them. By analyzing both uncoupled and coupled length-variable scenarios, and extending to exchangeable length variables driven by a Dirichlet process, the authors quantify when complex SBPs are close to simpler ones in total variation distance. Key contributions include near-closed-form expressions and bounds for D_KL, continuity and limiting results as hyperparameters vary, and practical guidance for prior selection via Occam's razor. The work provides a rigorous, quantitative framework for balancing prior flexibility against computational and mathematical tractability in Bayesian nonparametrics.
Abstract
The nonparametric view of Bayesian inference has transformed statistics and many of its applications. The canonical Dirichlet process and other more general families of nonparametric priors have served as a gateway to solving frontier uncertainty quantification problems of large, or infinite, dimension. This success is largely due to the available constructions and representations of such distributions. The two most useful constructions are the one based on the normalization of homogeneous completely random measures and the one based on stick-breaking processes. Hence, understanding their distributional features, and how different random probability measures compare with one another, is a key ingredient for their proper application. In this paper, we analyse the discrepancy among some nonparametric priors employed in the literature. First, we compute the mean and variance of the random Kullback-Leibler divergence between the Dirichlet process and the geometric process. We then extend the analysis to a broader class of exchangeable stick-breaking processes, which includes the Dirichlet and geometric processes as extreme cases. Our results establish quantitative conditions under which all of the aforementioned priors are close in total variation distance. In such instances, Occam's razor favours the simpler process.
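To make the two extreme cases concrete, the following minimal sketch draws truncated stick-breaking weights for a Dirichlet process (independent Beta(1, theta) length variables) and for a geometric process (a single shared length variable), and then computes a Kullback-Leibler divergence between the two renormalized weight sequences. The function names, the truncation level, the renormalization step, and the choice to compare weight sequences directly are assumptions made for illustration only; they are not taken from the paper and do not reproduce its mean and variance results.

```python
import numpy as np


def stick_breaking_weights(v):
    """Map length variables v_1, ..., v_n to weights w_j = v_j * prod_{i<j} (1 - v_i)."""
    v = np.asarray(v, dtype=float)
    # Stick remaining before each break: 1, (1 - v_1), (1 - v_1)(1 - v_2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def dirichlet_process_weights(theta, n, rng):
    """First n weights of a Dirichlet process: i.i.d. Beta(1, theta) length variables."""
    return stick_breaking_weights(rng.beta(1.0, theta, size=n))


def geometric_process_weights(v, n):
    """First n weights of a geometric process: every length variable equals v."""
    return stick_breaking_weights(np.full(n, v))


def truncated_kl(p, q):
    """KL divergence between two truncated weight sequences after renormalizing.

    Illustrative assumption: both sequences are renormalized to sum to one.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    mask = p > 0  # terms with p_j = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_dp = dirichlet_process_weights(theta=1.0, n=50, rng=rng)
    w_geo = geometric_process_weights(v=0.5, n=50)
    print("truncated KL(DP || geometric):", truncated_kl(w_dp, w_geo))
```

Repeating the Dirichlet process draw many times and averaging the resulting divergences gives a rough Monte Carlo view of how the random divergence behaves as theta and v vary, under the truncation assumed here.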
