Concentration of discrepancy-based approximate Bayesian computation via Rademacher complexity

Sirio Legramanti, Daniele Durante, Pierre Alquier

TL;DR

This work addresses the concentration behavior of discrepancy-based ABC posteriors by linking their asymptotics to the Rademacher complexity of the discrepancy’s function class within integral probability semimetrics (IPS). It offers a unified theory that yields uniform, constructible concentration bounds applicable to misspecified and non-i.i.d. data, without requiring strong regularity conditions on the data-generating process. The authors derive general results for fixed and shrinking tolerance regimes, and specialize them to MMD and Wasserstein-1 distances, providing explicit rates under bounded and unbounded settings. Illustrative simulations corroborate the theory, showing robust performance of IPS discrepancies with uniformly vanishing Rademacher complexity under model misspecification and contamination. The framework guides principled discrepancy choice and opens avenues for extending discrepancy-based ABC to broader pseudo-posterior settings and $f$-divergences.

Abstract

There has been increasing interest in summary-free solutions for approximate Bayesian computation (ABC) that replace distances among summaries with discrepancies between the empirical distributions of the observed data and the synthetic samples generated under the proposed parameter values. The success of these strategies has motivated theoretical studies on the limiting properties of the induced posteriors. However, a theoretical framework for summary-free ABC is still lacking that (i) is unified, instead of discrepancy-specific, (ii) does not constrain the analysis to data-generating processes and statistical models meeting specific regularity conditions, but rather facilitates the derivation of limiting properties that hold uniformly, and (iii) relies on verifiable assumptions that provide explicit concentration bounds clarifying which factors govern the limiting behavior of the ABC posterior. We address this gap via a novel theoretical framework that introduces the concept of Rademacher complexity in the analysis of the limiting properties of discrepancy-based ABC posteriors, including in non-i.i.d. and misspecified settings. This yields a unified theory that relies on constructive arguments and provides more informative asymptotic results and uniform concentration bounds, even in settings not covered by current studies. These advancements are obtained by relating the asymptotic properties of summary-free ABC posteriors to the behavior of the Rademacher complexity associated with the chosen discrepancy in the family of integral probability semimetrics (IPS). The IPS class extends summary-based distances, and includes the Wasserstein distance and maximum mean discrepancy, among others. As clarified in specialized theoretical analyses of popular IPS discrepancies and via illustrative simulations, this perspective improves the understanding of summary-free ABC.
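To make the summary-free idea concrete, the following is a minimal sketch of rejection ABC in which a proposed parameter is accepted when the discrepancy between the observed and synthetic samples falls below a tolerance. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian location model, the prior, the kernel bandwidth, the tolerance, and the names `gaussian_mmd2` and `rejection_abc` are all hypothetical choices made for this example.

```python
import numpy as np

def gaussian_mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two 1-D samples,
    using a Gaussian kernel with the given bandwidth (an assumed choice)."""
    def gram(a, b):
        d2 = (a[:, None] - b[None, :]) ** 2
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()

def rejection_abc(observed, simulate, prior_sample, discrepancy,
                  tolerance, n_proposals, rng):
    """Keep every prior draw whose synthetic sample lies within `tolerance`
    of the observed sample under `discrepancy`."""
    accepted = []
    for _ in range(n_proposals):
        theta = prior_sample(rng)
        synthetic = simulate(theta, observed.shape[0], rng)
        if discrepancy(observed, synthetic) <= tolerance:
            accepted.append(theta)
    return np.array(accepted)

rng = np.random.default_rng(0)
# Hypothetical observed data: Gaussian with location 1 and unit scale.
observed = rng.normal(loc=1.0, scale=1.0, size=100)

draws = rejection_abc(
    observed,
    simulate=lambda theta, m, rng: rng.normal(loc=theta, scale=1.0, size=m),
    prior_sample=lambda rng: rng.normal(loc=0.0, scale=3.0),  # hypothetical prior
    discrepancy=gaussian_mmd2,
    tolerance=0.05,   # hypothetical tolerance on the squared MMD
    n_proposals=2000,
    rng=rng,
)
print(draws.size, draws.mean() if draws.size else float("nan"))
```

Swapping `gaussian_mmd2` for another discrepancy (for instance a Wasserstein-1 distance or a distance between sample means) changes only the `discrepancy` argument, which is the sense in which a unified treatment of IPS discrepancies is convenient.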

Paper Structure

This paper contains 19 sections, 13 theorems, 78 equations, 1 figure, 2 tables.

Key Result

Lemma 2.6

Let $x_{1:n}$ be i.i.d. from some distribution $\mu \in \mathcal{P}(\mathcal{Y})$. Then, for any $b$-uniformly bounded class $\mathfrak{F}$, i.e., any class $\mathfrak{F}$ of functions $f$ such that $\|f\|_\infty\leq b$, any integer $n \geq 1$ and scalar $\delta \geq 0$, it holds that … and $\mathbb{P}_{x_{1:n}} [ \mathcal{D}_{\mathfrak{F}}(\hat{\mu}_{x_{1:n}},\mu) \geq \mathfrak{R}_{\mu,n}(\dots$
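The lemma controls the IPS between the empirical and the true distribution through the Rademacher complexity of the function class. As a hedged illustration, the sketch below estimates the empirical Rademacher complexity by Monte Carlo for one specific class: the unit ball of the RKHS of a Gaussian kernel, which is the class that defines MMD. It relies on the standard RKHS identity $\sup_{\|f\|_{\mathcal{H}} \leq 1} |n^{-1}\sum_i \epsilon_i f(x_i)| = n^{-1}\sqrt{\epsilon^\top K \epsilon}$, which is not stated in the text shown here. The data distribution, bandwidth, number of sign draws, and the function name `rademacher_mmd_ball` are assumptions made for this example.

```python
import numpy as np

def rademacher_mmd_ball(x, bandwidth=1.0, n_draws=500, rng=None):
    """Monte Carlo estimate of the empirical Rademacher complexity of the unit
    ball of a Gaussian-kernel RKHS, conditional on the 1-D sample x."""
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    # Gram matrix K with entries k(x_i, x_j); its diagonal is identically 1.
    gram = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * bandwidth ** 2))
    # Each row is one vector of i.i.d. Rademacher signs.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, n))
    # Quadratic forms eps^T K eps, one per row of signs.
    quad = ((signs @ gram) * signs).sum(axis=1)
    # The supremum over the unit ball equals sqrt(eps^T K eps) / n.
    return np.sqrt(np.maximum(quad, 0.0)).mean() / n

rng = np.random.default_rng(1)
estimates = {}
for n in (50, 200, 800):
    x = rng.normal(size=n)  # hypothetical data-generating distribution
    estimates[n] = rademacher_mmd_ball(x, rng=rng)
    print(n, estimates[n], 1.0 / np.sqrt(n))
```

For a kernel bounded by 1, Jensen's inequality gives the bound $n^{-1/2}$ on this quantity whatever the data distribution, so the printed estimates can be compared against the second printed column. Averaging the estimate over repeated samples `x` would approximate the distribution-level complexity $\mathfrak{R}_{\mu,n}(\mathfrak{F})$ rather than the sample-conditional one.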

Figures (1)

  • Figure 1: Graphical representation of the ABC posterior for $\theta$ under MMD with Gaussian kernel, Wasserstein-1 distance, summary-based distance (mean), and KL divergence for one simulated dataset from a misspecified Huber contamination model with varying $\alpha\in\{0.05, 0.10, 0.15\}$; see also Example \ref{exm_huber} for details. The red dashed line marks the location parameter $\theta_0=1$ of the uncontaminated model.

Theorems & Definitions (41)

  • Definition 2.1: Integral probability semimetric (IPS)
  • Example 2.2
  • Example 2.3
  • Example 2.4
  • Definition 2.5: Rademacher complexity
  • Lemma 2.6: Theorem 4.10 and Proposition 4.12 in Wainwright (2019)
  • Theorem 3.1
  • Corollary 3.2
  • Theorem 3.3
  • Example 3.4
  • ...and 31 more