Quasi-Bayes meets Vines
David Huk, Yuanhe Zhang, Mark Steel, Ritabrata Dutta
TL;DR
The paper targets scalable, explicit-density estimation in high dimensions by combining quasi-Bayesian (QB) predictive densities with Sklar's theorem. It decomposes the joint predictive $\boldsymbol{p}^{(n)}(\boldsymbol{x})$ into univariate marginal predictives, estimated by fast QB recursions, and a flexible vine copula that captures dependence. The result is an analytical, copula-based density whose convergence rate can be dimension-independent under simplified vine assumptions. The authors introduce the QB-Vine framework, provide convergence results for marginals and copulas, and demonstrate robustness through energy-score-based hyperparameter tuning and parallelizable computation. Empirically, QB-Vine achieves state-of-the-art or competitive density estimation and supervised task performance on moderate- to high-dimensional data (up to $d\sim 64$) with relatively small training samples, highlighting its data efficiency and scalability compared to neural and other Bayesian approaches.
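As a reading aid, the decomposition described above can be written out using the density form of Sklar's theorem. The notation here is assumed for illustration and may differ from the paper's: $p_i^{(n)}$ and $P_i^{(n)}$ denote the $i$-th marginal predictive density and its distribution function after $n$ observations, and $\boldsymbol{c}^{(n)}$ denotes the copula density on $[0,1]^d$. For continuous marginals,

$$
\boldsymbol{p}^{(n)}(\boldsymbol{x}) \;=\; \boldsymbol{c}^{(n)}\!\left(P_1^{(n)}(x_1), \dots, P_d^{(n)}(x_d)\right) \prod_{i=1}^{d} p_i^{(n)}(x_i).
$$

In the QB-Vine, each factor $p_i^{(n)}$ is obtained from a univariate QB recursion, and $\boldsymbol{c}^{(n)}$ is modelled by a vine copula, so the $d$ marginals can be fitted independently of one another.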
Abstract
Recently proposed quasi-Bayesian (QB) methods initiated a new era in Bayesian computation by directly constructing the Bayesian predictive distribution through recursion, removing the need for expensive computations involved in sampling the Bayesian posterior distribution. This has proved to be data-efficient for univariate predictions, but extensions to multiple dimensions rely on a conditional decomposition resulting from predefined assumptions on the kernel of the Dirichlet Process Mixture Model, which is the implicit nonparametric model used. Here, we propose a different way to extend quasi-Bayesian prediction to high dimensions through the use of Sklar's theorem, by decomposing the predictive distribution into one-dimensional predictive marginals and a high-dimensional copula. Thus, we use the efficient recursive QB construction for the one-dimensional marginals and model the dependence using highly expressive vine copulas. Further, we tune hyperparameters using robust divergences (e.g., the energy score) and show that our proposed Quasi-Bayesian Vine (QB-Vine) is a fully non-parametric density estimator with an analytical form and a convergence rate independent of the dimension of the data in some situations. Our experiments illustrate that the QB-Vine is appropriate for high-dimensional distributions ($\sim$64), needs very few samples to train ($\sim$200), and outperforms state-of-the-art methods with analytical forms for density estimation and supervised tasks by a considerable margin.
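The abstract mentions tuning hyperparameters with the energy score. As a rough illustration of what that score measures, the sketch below gives a generic sample-based estimator of $\mathrm{ES}(P, \boldsymbol{y}) = \mathbb{E}\lVert \boldsymbol{X} - \boldsymbol{y}\rVert - \tfrac{1}{2}\mathbb{E}\lVert \boldsymbol{X} - \boldsymbol{X}'\rVert$ (the common exponent-1 form, lower is better). It is not the authors' implementation: the function name `energy_score`, the exponent choice, and the toy Gaussian forecasts are assumptions made for this example.

```python
import math
import random
from itertools import combinations


def energy_score(samples, y):
    """Monte Carlo estimate of the energy score (exponent 1), lower is better.

    samples: sequence of equal-length tuples drawn from the forecast distribution
    y:       observed point, a tuple of the same length
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two forecast samples")
    # First term: mean Euclidean distance from forecast samples to the observation.
    term_obs = sum(math.dist(x, y) for x in samples) / m
    # Second term: mean distance between distinct pairs of forecast samples.
    n_pairs = m * (m - 1) / 2
    term_spread = sum(math.dist(a, b) for a, b in combinations(samples, 2)) / n_pairs
    return term_obs - 0.5 * term_spread


# Toy comparison (hypothetical data): two 2-D Gaussian forecasts for the same observation.
rng = random.Random(0)
y_obs = (0.0, 0.0)
near = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
far = [(rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)) for _ in range(200)]
print("near forecast:", energy_score(near, y_obs))
print("far forecast: ", energy_score(far, y_obs))
```

Because the score only needs samples from the forecast, a hyperparameter could in principle be selected by evaluating such a score on held-out points for each candidate value and keeping the smallest average. How the paper actually does this is not specified in the abstract.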
