TabMGP: Martingale posterior with TabPFN

Kenyon Ng, Edwin Fong, David T. Frazier, Jeremias Knoblauch, Susan Wei

TL;DR

TabMGP introduces a novel martingale-posterior framework powered by TabPFN to obtain uncertainty about functionals $\theta(F_\infty)$ for tabular data without specifying a prior or likelihood. By forward sampling with TabPFN (and Bayesian bootstrap for covariates), it generates empirical samples $\theta(F_N)$ that approximate the posterior of $\theta(F_\infty)$ given data, yielding near-nominal coverage and expressive posteriors. Despite lacking strict martingale or a.c.i.d. guarantees for the predictive rule, TabMGP demonstrates robust uncertainty quantification across synthetic and real datasets, often outperforming handcrafted MGPs and standard Bayesian baselines. The work also highlights gaps in current theory, advocating empirical diagnostics and weaker conditions to justify validity in the era of foundation-model predictors.

Abstract

Bayesian inference provides principled uncertainty quantification but is often limited by challenges of prior and likelihood elicitation. The martingale posterior (MGP) (Fong et al., 2023) offers an alternative by replacing these requirements with a predictive rule. Additionally, the MGP focuses inference on parameters defined through a loss function. This framework is especially resonant in the era of foundation transformers: practitioners increasingly leverage models like TabPFN for their state-of-the-art capabilities, yet often require epistemic uncertainty for a scientific estimand $\theta$ that need not parameterise the model's implicit latent model. The MGP provides the mechanism to recover these posterior distributions. We introduce TabMGP, an MGP built on TabPFN for tabular data. TabMGP produces credible sets with near-nominal coverage and often outperforms both handcrafted MGP constructions and standard Bayesian baselines.

Paper Structure

This paper contains 21 sections, 29 equations, 12 figures, 12 tables, 1 algorithm.

Figures (12)

  • Figure 1: TabMGP for obtaining posterior samples of $\theta(F_\infty) \mid z_{1:n}$, where each $z = (x, y)$ represents a covariate-response pair. We perform forward sampling with TabPFN to generate $L$ independent continuations of the observed dataset $z_{1:n}$. Since TabPFN does not model covariates, the forward sampling of $x_i$ is performed via a separate process, while TabPFN provides the conditional response $y_i \sim \mathrm{TabPFN}(\cdot \mid x_i, z_{1:i-1})$. At the termination of each rollout $l$, we form the empirical measure $F_N^{(l)} = \frac{1}{N} \left(\sum_{i=1}^{n} \delta_{z_i} + \sum_{i=n+1}^{N} \delta_{z_{i}^{(l)}}\right)$ and collect $\theta(F_N^{(l)})$ as one approximate posterior sample.
  • Figure 2: Concentration of TabMGP. The black vertical line indicates the population risk minimiser $\theta(F^{\star}) = (0, {\beta^{\star}}^{\top})$. See Appendix \ref{sec:tabmgp-validity} for experimental details.
  • Figure 3: Expected $L_{1}$-norm between $\theta(F_{N})$ from TabMGP and $\theta(F_{n})$ as $N$ increases. Each of the 30 trajectories corresponds to a realisation of $z_{1:n}$ from one setup.
  • Figure 4: Posterior densities of the intercept in the 'concrete' setup. For each posterior density, the 95% marginal credible interval is shown as a horizontal bar, and the posterior mean is marked. The solid and dashed black vertical lines correspond to $\theta(F^{\star})$ and $\theta(F_{n})$ respectively.
  • Figure 5: Cumulative sum (over $N$) of the $L_{1}$-distance between $\mathbb{E}_{y_{i+1}}[p_{i+1}(y \mid x^{\star}) \mid z_{1:i}]$ and $p_i(y \mid x^{\star})$ across ten choices of $x^{\star}$. Ideally, this cumulative sum converges, which is what the a.c.i.d. condition requires.
  • ...and 7 more figures
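
The forward-sampling procedure in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The function `stand_in_predictive_sample` is a hypothetical Gaussian plug-in that takes the place of $y_i \sim \mathrm{TabPFN}(\cdot \mid x_i, z_{1:i-1})$, so no TabPFN API is called. Covariates are forward sampled with a Pólya urn, assumed here as the predictive form of the Bayesian bootstrap. The functional $\theta$ is taken to be the least-squares intercept and slopes, and all function and variable names are invented for the example.

```python
import numpy as np


def stand_in_predictive_sample(rng, x_new, X, y):
    """Hypothetical stand-in for y ~ TabPFN(. | x_new, z_{1:i-1}).

    Assumption: a Gaussian plug-in predictive from a least-squares fit.
    A real TabMGP run would query TabPFN here instead.
    """
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma = np.sqrt(np.mean((y - A @ coef) ** 2))
    return rng.normal(coef[0] + x_new @ coef[1:], sigma)


def theta(X, y):
    """theta(F): squared-loss risk minimiser (intercept, slopes) under the
    empirical measure that puts equal mass on every row of (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def forward_sample_posterior(X, y, N, L, seed=0):
    """Return L draws of theta(F_N), one per independent rollout."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    draws = np.empty((L, d + 1))
    for l in range(L):
        Xl, yl = np.empty((N, d)), np.empty(N)
        Xl[:n], yl[:n] = X, y  # observed z_{1:n} is shared by all rollouts
        for i in range(n, N):
            # Polya urn: copy a covariate uniformly from everything seen so far
            Xl[i] = Xl[rng.integers(i)]
            # Response drawn from the one-step-ahead predictive
            yl[i] = stand_in_predictive_sample(rng, Xl[i], Xl[:i], yl[:i])
        # F_N^{(l)} weights observed and imputed points equally (1/N each)
        draws[l] = theta(Xl, yl)
    return draws


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(30, 2))
    y = 1.0 + X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=30)
    draws = forward_sample_posterior(X, y, N=60, L=8)
    # Equal-tailed 95% marginal credible intervals, one column per coordinate
    print(np.percentile(draws, [2.5, 97.5], axis=0))
```

Refitting the stand-in at every step keeps the sketch short but is slow for large $N$. Only the predictive call would need to change to plug in a different one-step-ahead rule.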