Asymptotic Inference for Exchangeable Gibbs Partitions

Takuya Koriyama

Abstract

We study the asymptotic properties of parameter estimation and predictive inference under the exchangeable Gibbs partition, characterized by a discount parameter $\alpha\in(0,1)$ and a triangular array $v_{n,k}$ satisfying a backward recursion. Assuming that $v_{n,k}$ admits a mixture representation over the Ewens--Pitman family $(\alpha, \theta)$, with $\theta$ integrated against an unknown mixing distribution, we show that the (quasi) maximum likelihood estimator (QMLE) $\hat{\alpha}_n$ for $\alpha$ is asymptotically mixed normal. This extends earlier results for the Ewens--Pitman model to a more general class. We further study the predictive task of estimating the probability simplex $\mathsf{p}_n$, which governs the allocation of the $(n+1)$-th item, conditional on the current partition of $[n]$. Based on the asymptotics of the QMLE $\hat{\alpha}_n$, we construct an estimator $\hat{\mathsf{p}}_n$ and derive the limit distributions of the $f$-divergence $\mathsf{D}_f(\hat{\mathsf{p}}_n \,\|\, \mathsf{p}_n)$ for general convex functions $f$, including explicit results for the TV distance and KL divergence. These results lead to asymptotically valid confidence intervals for both parameter estimation and prediction.
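To make the predictive object concrete, the following is a minimal sketch for the Ewens--Pitman special case only, where the allocation probabilities of the $(n+1)$-th item have the well-known closed form $(n_j-\alpha)/(\theta+n)$ for joining block $j$ and $(\theta+k\alpha)/(\theta+n)$ for opening a new block. The function names, the example block sizes, the parameter values, and the convention $\mathsf{D}_f(p\|q)=\sum_j q_j f(p_j/q_j)$ are assumptions made for illustration; the paper's estimator $\hat{\mathsf{p}}_n$ is not reproduced here.

```python
import math

def predictive_simplex(block_sizes, alpha, theta):
    """Ewens-Pitman allocation probabilities for item n+1 (illustrative helper).

    Entry j < k is the probability of joining existing block j;
    the last entry is the probability of opening a new block.
    """
    n = sum(block_sizes)
    k = len(block_sizes)
    denom = theta + n
    probs = [(size - alpha) / denom for size in block_sizes]
    probs.append((theta + k * alpha) / denom)
    return probs

def f_divergence(p, q, f):
    """D_f(p || q) = sum_j q_j * f(p_j / q_j), assuming all q_j > 0."""
    return sum(b * f(a / b) for a, b in zip(p, q))

# Hypothetical partition of n = 10 items into k = 4 blocks.
blocks = [5, 3, 1, 1]
p_true = predictive_simplex(blocks, alpha=0.5, theta=1.0)  # assumed "true" parameters
p_hat = predictive_simplex(blocks, alpha=0.4, theta=0.0)   # assumed plug-in values

tv = f_divergence(p_hat, p_true, lambda t: 0.5 * abs(t - 1.0))  # total variation
kl = f_divergence(p_hat, p_true, lambda t: t * math.log(t))     # KL divergence
print(tv, kl)
```

Taking $f(t)=\tfrac12|t-1|$ and $f(t)=t\log t$ gives the TV distance and KL divergence mentioned in the abstract, so a single generic function covers both cases.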

Paper Structure

This paper contains 27 sections, 22 theorems, 237 equations, 2 figures, 1 table, and 1 algorithm.

Key Result

Theorem 2.1

Let the assumption be satisfied. Then there exists a positive random variable $\mathsf{S}_{\alpha, \mu}$ with bounded second moment such that […]. Furthermore, the first moment of the almost sure limit $\mathsf{S}_{\alpha, \mu}$ is bounded as […].

Figures (2)

  • Figure 1: Histogram of the number of nonempty sets $\mathsf{k}_n$ normalized by $n^{\alpha}$. Parameters are taken as $n=50000$, $\alpha=0.3$, and sample size = $10000$.
  • Figure 2: QQ plots comparing empirical quantiles of normalized statistics to the corresponding theoretical distributions (see \ref{eq:result_summary}) with parameters $n=20000$, $\alpha=0.8$, and sample size = $1000$.
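
A simulation in the spirit of Figure 1 can be sketched as follows, under the assumption that the data come from a single Ewens--Pitman $(\alpha, \theta)$ model rather than the general mixture studied in the paper. In that case the number of blocks $\mathsf{k}_n$ evolves as a Markov chain: with $m$ items in $k$ blocks, the next item opens a new block with probability $(\theta + k\alpha)/(\theta + m)$. The value of $\theta$, the reduced $n$ and sample size, and the function name are illustrative choices, not taken from the figure.

```python
import random

def sample_num_blocks(n, alpha, theta, rng):
    """Draw k_n, the number of blocks after n items, under Ewens-Pitman(alpha, theta)."""
    k = 1  # the first item always opens the first block
    for m in range(1, n):
        # With m items in k blocks, item m+1 opens a new block
        # with probability (theta + k*alpha) / (theta + m).
        if rng.random() < (theta + k * alpha) / (theta + m):
            k += 1
    return k

rng = random.Random(0)               # fixed seed for reproducibility
n, alpha, theta = 2000, 0.3, 1.0     # theta is an assumed value
num_samples = 200                    # much smaller than in Figure 1, for speed

# Normalized statistic k_n / n^alpha, one value per simulated partition.
normalized = [sample_num_blocks(n, alpha, theta, rng) / n**alpha
              for _ in range(num_samples)]
print(min(normalized), max(normalized))
```

A histogram of `normalized` plays the role of Figure 1. Tracking only the block count is enough here because the new-block probability depends on the partition only through $m$ and $k$.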

Theorems & Definitions (35)

  • Theorem 2.1: $\alpha$-diversity
  • Remark 1
  • Theorem 2.2
  • Proposition 1: koriyama2022asymptotic
  • Definition 3.1: Quasi-Maximum Likelihood Estimator
  • Proposition 2
  • Definition 3.2: hausler2015stable
  • Lemma 1: hausler2015stable
  • Theorem 3.1: Asymptotic Mixed Normality
  • Corollary 1: Confidence interval of $\alpha$
  • ...and 25 more
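
The estimator named in Definition 3.1 can be illustrated with a short sketch. It assumes that the quasi-likelihood is the Ewens--Pitman likelihood with $\theta$ fixed at $0$, whose $\alpha$-dependent part is $(k-1)\log\alpha + \sum_{j}\sum_{i=1}^{n_j-1}\log(i-\alpha)$; this summary does not spell out the definition, so that choice, the function names, and the example data are assumptions. Since this function is strictly concave in $\alpha$ when $1 < k < n$, its derivative has a unique root in $(0,1)$, which the code locates by bisection.

```python
def quasi_score(block_sizes, a):
    """Derivative in alpha of the theta = 0 Ewens-Pitman log-likelihood (assumed form)."""
    k = len(block_sizes)
    s = (k - 1) / a
    for size in block_sizes:
        for i in range(1, size):
            s -= 1.0 / (i - a)
    return s

def qmle_alpha(block_sizes, tol=1e-10):
    """Root of quasi_score on (0, 1), found by bisection."""
    k = len(block_sizes)
    n = sum(block_sizes)
    if not 1 < k < n:
        raise ValueError("an interior maximizer requires 1 < k < n")
    lo, hi = 1e-12, 1.0 - 1e-12  # score is positive near 0 and negative near 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quasi_score(block_sizes, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical partition of n = 10 items into k = 4 blocks.
blocks = [5, 3, 1, 1]
alpha_hat = qmle_alpha(blocks)
print(alpha_hat)
```

The guard on $1 < k < n$ reflects the two degenerate cases: with a single block the score is negative throughout, and with all singletons it is positive throughout, so no interior root exists.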