A unified construction for series representations and finite approximations of completely random measures

Juho Lee, Xenia Miscouridou, François Caron

TL;DR

This work unifies and extends the construction of series representations and finite-dimensional approximations for infinite-activity completely random measures (CRMs) by embedding Poisson random measures in an augmented space and applying arrival-time kernels. The framework encompasses inverse-Lévy and size-biased representations as special cases, and yields new series and iid representations for important CRMs such as the generalized gamma process and the stable beta process, along with a systematic truncation-error analysis. It provides concrete instances (deterministic, exponential, gamma, inverse gamma, and generalized Pareto arrival times) and derives tractable, conjugate-like distributions (e.g., etBFRY, gBFRY) that facilitate simulation and posterior inference. The truncation results quantify approximation accuracy for functionals and marginal likelihoods, which supports the use of finite-dimensional approximations in Bayesian nonparametric modeling. Overall, the paper offers a practical, theoretically grounded toolkit for scalable inference with infinite-activity CRMs and their common instantiations.

Abstract

Infinite-activity completely random measures (CRMs) have become important building blocks of complex Bayesian nonparametric models. They have been successfully used in various applications such as clustering, density estimation, latent feature models, survival analysis, and network science. Popular infinite-activity CRMs include the (generalized) gamma process and the (stable) beta process. However, except in some specific cases, exact simulation or scalable inference with these models is challenging, and finite-dimensional approximations are often considered. In this work, we propose a general and unified framework to derive both series representations and finite-dimensional approximations of CRMs. Our framework can be seen as an extension of constructions based on size-biased sampling of Poisson point processes [Perman1992]. It includes as special cases several known series representations as well as novel ones. In particular, we show that one can get novel series representations for the generalized gamma process and the stable beta process. We also provide some analysis of the truncation error.
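To make the idea of a truncated series representation concrete, the following is a minimal sketch of the classical inverse-Lévy construction, one of the known special cases the framework covers, applied to a stable process. It is an illustration under stated assumptions, not the paper's own algorithm: the Lévy intensity is assumed to be $\rho(dw)=\frac{\alpha\sigma}{\Gamma(1-\sigma)}w^{-1-\sigma}dw$ with $\sigma\in(0,1)$, and the function name `stable_inverse_levy_weights` and its parameters are hypothetical.

```python
import math
import random


def stable_inverse_levy_weights(n, sigma, alpha=1.0, seed=None):
    """Return the n largest weights of a stable CRM (inverse-Levy sketch).

    Assumed Levy intensity (not taken from the source):
        rho(dw) = alpha * sigma / Gamma(1 - sigma) * w**(-1 - sigma) dw,
    whose tail mass is rho_bar(x) = alpha * x**(-sigma) / Gamma(1 - sigma).
    """
    if not 0.0 < sigma < 1.0:
        raise ValueError("sigma must lie in (0, 1)")
    rng = random.Random(seed)
    scale = math.gamma(1.0 - sigma) / alpha
    weights = []
    arrival = 0.0
    for _ in range(n):
        # Arrival times of a unit-rate Poisson process on (0, inf).
        arrival += rng.expovariate(1.0)
        # Invert the tail mass: W_i = rho_bar^{-1}(arrival_i).
        weights.append((arrival * scale) ** (-1.0 / sigma))
    return weights


# Truncate the infinite series G = sum_i W_i * delta_{theta_i} to 20 atoms.
weights = stable_inverse_levy_weights(20, sigma=0.5, alpha=2.0, seed=0)
```

Because the arrival times increase, the weights come out in decreasing order, so keeping the first `n` terms keeps the `n` largest atoms. The atom locations $\theta_i$ would be drawn independently from the normalized base measure, which is omitted here.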

Paper Structure

This paper contains 55 sections, 11 theorems, 141 equations, 5 figures, 1 table.

Key Result

Theorem 3.1

Let $\lambda_w$ be a parametric distribution on $(0,\infty)$ with parameter $w>0$ and $\Lambda_w$ be the associated parametric cumulative distribution function (cdf) satisfying condition eq:condLambda. Consider the conditional distributions ... where ... The sequential construction $G=\sum_{i=1}^\infty W_i\delta_{\theta_i}$ is obtained as follows, for $i\geq 1$: ... The truncated exchangeable construction $G_n=\sum\dots$

Figures (5)

  • Figure 1: (Left) Constant $C_1(\sigma)$ for deterministic and gamma arrival times; (Middle-Right) Simulated error $R_n$ with gamma arrival times for (Middle) stable process and (Right) GGP.
  • Figure 2: Gamma and inverse gamma arrival times for stable processes (a-b), and gamma arrival times for the GGP (c), plotted over the whole range. Note how the variances diminish. Note also that for inverse gamma arrival times with $\kappa=1$, the variances diverge, as our theory predicts.
  • Figure 3: $C_1(\sigma)$ values for gamma (a) and inverse gamma (b) arrival times, and a comparison between them (c).
  • Figure 4: Gamma arrival times vs. inverse gamma arrival times for the stable process, with $\sigma=0.4$ (a) and $\sigma=0.7$ (b). Gamma is better when $\sigma < 0.5$, and inverse gamma is better when $\sigma > 0.5$, as predicted in Figure 3 (c).
  • Figure 5: Empirical approximation error $R_{n,\hat{n}}$ compared to asymptotic error $R_n$, for (a) $\sigma=0.4$ and (b) $\sigma=0.7$.

Theorems & Definitions (18)

  • Remark 2.1
  • Theorem 3.1
  • Proposition 3.1
  • Proposition 5.1
  • Proposition 5.2
  • Proposition 5.3
  • Definition A.1: Slowly varying function
  • Definition A.2: Regularly varying function
  • Theorem A.1: Karamata's theorem
  • Corollary A.1
  • ...and 8 more