Bayesian calculus and predictive characterizations of extended feature allocation models
Mario Beraha, Federico Camerlenghi, Lorenzo Ghilotti
TL;DR
This work develops a unified Bayesian framework for extended feature allocation models, which allow interactions among features and dependence among their weights, and grounds the analysis in point-process theory and Palm calculus. It derives closed-form marginal, posterior, and predictive distributions without restricting the prior to completely random measures (CRMs), and it characterizes two predictive "sufficientness" postulates: one in which the predictive law of unseen features depends only on the sample size $n$ (characterized by Poisson priors), and another in which it also depends on the number of observed features $k$ (characterized by mixed Poisson and mixed binomial priors). The paper specializes the general theory to notable priors, including Poisson, mixed Poisson, mixed binomial, and a novel determinantal point process prior; the last of these captures repulsion among features and lets the predictive distribution depend on the observed feature labels. Key byproducts are a new Palm-calculus-based characterization of the Poisson process and tractable posterior forms for extended models, demonstrated in a spatial statistics application: estimating the size of the unseen forest and localizing unseen trees. Overall, the framework offers principled guidance for prior elicitation in feature allocation models and extends the Bayesian nonparametric toolkit by combining predictive characterizations with interacting feature structures.
Abstract
We introduce and study a unified Bayesian framework for extended feature allocations that flexibly captures interactions, such as repulsion or attraction, among features and their associated weights. We provide a complete Bayesian analysis of the proposed model and specialize our general theory to noteworthy classes of priors. These include a novel prior based on determinantal point processes, for which we show promising results in a spatial statistics application. Within the general class of extended feature allocations, we further characterize those priors that yield predictive probabilities of discovering new features depending either solely on the sample size or on both the sample size and the number of distinct observed features. These predictive characterizations, known as "sufficientness" postulates, have been extensively studied in the literature on species sampling models, starting from the seminal contribution of the English philosopher W. E. Johnson for the Dirichlet distribution. Within the feature allocation setting, existing predictive characterizations are limited to very specific examples; in contrast, our results are general and provide practical guidance for prior selection. Additionally, our approach, based on Palm calculus, is analytical in nature and yields a novel characterization of the Poisson point process through its reduced Palm kernel.
