Mixture of In-Context Prompters for Tabular PFNs
Derek Xu, Olcay Cirit, Reza Asadi, Yizhou Sun, Wei Wang
TL;DR
MixturePFN addresses the main scalability bottleneck of PFN-based in-context learning (ICL) on tabular data. Its first component, Mixture of In-Context Prompters (MICP), sparsely routes each test sample to a specialized prompter that holds a small, fixed-size prompt, reducing inference cost from $O(N_{\text{train}}^2)$ memory and time to $O(1)$ memory and $O(\log N_{\text{train}})$ time. Its second component, CaPFN, aligns the model with the inference-time data by training adapters on bootstrapped prompts while the PFN weights stay frozen, capturing the downstream dataset distribution without full fine-tuning. MixturePFN achieves state-of-the-art results on the TabZilla benchmark: it is the Condorcet winner across 36 datasets against 19 baselines, with statistically significant gains, and it scales robustly across dataset sizes and irregularities.
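The routing idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that prompters are defined by k-means centers, that each prompt holds the `prompt_size` training rows nearest to its center, and that a test row is sent to the prompter with the nearest center. The names `build_prompters` and `route` are hypothetical, and the linear scan over centers stands in for whatever sublinear search achieves the stated $O(\log N_{\text{train}})$ routing time.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_prompters(X, y, num_prompters, prompt_size, iters=10, seed=0):
    """Return a list of (center, prompt) pairs.

    Assumption: centers come from plain Lloyd (k-means) iterations, and each
    prompt is the prompt_size training rows closest to its center.
    """
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(X, num_prompters)]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in X:
            j = min(range(len(centers)), key=lambda k: sq_dist(x, centers[k]))
            groups[j].append(x)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    prompters = []
    for c in centers:
        nearest = sorted(range(len(X)), key=lambda i: sq_dist(X[i], c))[:prompt_size]
        prompters.append((c, [(X[i], y[i]) for i in nearest]))
    return prompters

def route(x_test, prompters):
    """Pick the fixed-size prompt whose center is nearest to x_test.

    Linear scan for clarity; a tree or index over the centers would be
    needed for sublinear routing.
    """
    j = min(range(len(prompters)), key=lambda k: sq_dist(x_test, prompters[k][0]))
    return prompters[j][1]
```

Because every prompt has the same fixed length, the context passed to the PFN for a given test row does not grow with the training set, which is where the constant memory cost in the summary comes from.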
Abstract
Recent benchmarks found that In-Context Learning (ICL) outperforms both deep learning and tree-based algorithms on small tabular datasets. However, on larger datasets, ICL for tabular learning cannot run without severely compromising performance, due to its quadratic space and time complexity w.r.t. dataset size. We propose MixturePFN, which both extends nearest-neighbor sampling to the state-of-the-art tabular ICL model and uses bootstrapping to finetune that model on the inference-time dataset. MixturePFN is the Condorcet winner across 36 diverse tabular datasets against 19 strong deep learning and tree-based baselines, achieving the highest mean rank among the top-10 of these algorithms with statistical significance.
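As a rough illustration of the bootstrapping step, the sketch below draws one finetuning task from the inference-time training set by sampling rows with replacement. The split into a context (prompt) part and a query part, and the name `bootstrap_task`, are assumptions made for this example and are not taken from the paper.

```python
import random

def bootstrap_task(X, y, prompt_size, query_size, rng):
    """Sample one (context, query) task with replacement.

    Assumption: the model would be finetuned to predict the query labels
    given the context rows as its prompt.
    """
    idx = [rng.randrange(len(X)) for _ in range(prompt_size + query_size)]
    rows = [(X[i], y[i]) for i in idx]
    return rows[:prompt_size], rows[prompt_size:]

# Example: one bootstrapped task from a toy training set.
rng = random.Random(0)
X_train = [[0.0], [0.5], [1.0], [1.5]]
y_train = [0, 0, 1, 1]
context, query = bootstrap_task(X_train, y_train, prompt_size=3, query_size=2, rng=rng)
```

Sampling with replacement means a row can land in both the context and the query of the same task; whether such overlap should be filtered out is a design decision this sketch leaves open.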
