Retrieval & Fine-Tuning for In-Context Tabular Models

Valentin Thomas, Junwei Ma, Rasa Hosseinzadeh, Keyvan Golestan, Guangwei Yu, Maksims Volkovs, Anthony Caterini

TL;DR

This work addresses the scalability gap of transformer-based in-context learning (ICL) for tabular data by introducing LoCalPFN, which combines kNN retrieval of local neighbours with end-to-end fine-tuning on the retrieved samples, using TabPFN as the base model. The approach achieves state-of-the-art performance across 95 TabZilla/OpenML datasets, outperforming both neural baselines and tuned tree-based methods, especially on larger and more complex tasks. The results show that a local context, together with joint retrieval and fine-tuning, substantially improves tabular ICL over the base in-context model, and they point to retrieval-augmented, locally calibrated transformers as a direction for future tabular foundation models.

Abstract

Tabular data is a pervasive modality spanning a wide range of domains, and the inherent diversity poses a considerable challenge for deep learning. Recent advancements using transformer-based in-context learning have shown promise on smaller and less complex datasets, but have struggled to scale to larger and more complex ones. To address this limitation, we propose a combination of retrieval and fine-tuning: we can adapt the transformer to a local subset of the data by collecting nearest neighbours, and then perform task-specific fine-tuning with this retrieved set of neighbours in context. Using TabPFN as the base model -- currently the best tabular in-context learner -- and applying our retrieval and fine-tuning scheme on top results in what we call a locally-calibrated PFN, or LoCalPFN. We conduct extensive evaluation on 95 datasets curated by TabZilla from OpenML, upon which we establish a new state-of-the-art with LoCalPFN -- even with respect to tuned tree-based models. Notably, we show a significant boost in performance compared to the base in-context model, demonstrating the efficacy of our approach and advancing the frontier of deep learning in tabular data.
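
The retrieval step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `knn_context`, the use of Euclidean distance on raw features, and the toy data are all assumptions made for this example. The retrieved pairs would then be given to the in-context model as its context for that one query (that call is not shown).

```python
import numpy as np

def knn_context(X_train, y_train, x_query, k):
    """Return the k training points closest to x_query, with labels and indices.

    Assumption: plain Euclidean distance on raw features (hypothetical choice).
    """
    dists = np.linalg.norm(X_train - x_query, axis=1)
    k = min(k, len(X_train))          # cannot retrieve more points than exist
    idx = np.argsort(dists)[:k]       # indices of the k nearest neighbours
    return X_train[idx], y_train[idx], idx

# Toy data (hypothetical): two classes separated by a circle of radius 1.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 2))
y_train = (np.linalg.norm(X_train, axis=1) > 1.0).astype(int)

x_query = np.array([0.1, -0.2])
X_ctx, y_ctx, idx = knn_context(X_train, y_train, x_query, k=32)
# (X_ctx, y_ctx) is the local context that would be passed to the
# in-context model for this query, instead of the full training set.
```

Each query gets its own context, so the context length is set by `k` rather than by the size of the training set.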

Paper Structure (29 sections, 2 equations, 11 figures, 9 tables)

This paper contains 29 sections, 2 equations, 11 figures, 9 tables.

Figures (11)

  • Figure 1: a) TabPFN -- even when using the entire training data as context -- underfits and cannot classify patterns such as three pairs of concentric circles of two classes. Decision boundaries are in black and shaded areas show the predicted class. b) Applying an adaptive local context for each point using its $k$ nearest neighbours can easily solve this problem. c) We observe that this approach is robust to the number of neighbours used ($k$), even when the dataset complexity increases, and always performs better than vanilla TabPFN using full context ($N=1000$). Each point is averaged over 25 seeds.
  • Figure 2: Example of the behaviour of TabPFN and TabPFN-$k$NN as we vary the dataset size and the context length for three large datasets. TabPFN is in shades of green and TabPFN-$k$NN is in shades of blue. The opacity represents the context length used (also labelled on each line). It corresponds to random training samples for TabPFN and nearest neighbours for TabPFN-$k$NN. TabPFN is limited by context size and cannot make efficient use of larger datasets. While for context length $=$ dataset size ($k=N$) TabPFN and TabPFN-$k$NN have the same performance, TabPFN-$k$NN can leverage larger datasets with $k$NN-based contexts and shows improvements, often even for lower context lengths. Each point on this plot is the average of $100$ random resamplings of the data.
  • Figure 3: Details of the architecture and the efficient context used during fine-tuning. a) During inference, for each query $x_\text{qy}$, we compute its $k$NNs and use them as context. b) During fine-tuning, we have a modified procedure allowing shared context between many queries. We first select a random training point, then compute its $k$NNs. Finally we randomly split those into a context and a query set, allowing us to have a shared (yet local) context for many queries, similarly to vanilla TabPFN.
  • Figure 4: Analysis of AUC as a function of size and complexity. TabPFN fails to scale both in size and complexity while LoCalPFN is able to still outperform on the far end of the spectrum. See \ref{fig:analysis_absolute} for a version with absolute AUC.
  • Figure 5: Ablations for different design choices on all 95 datasets. Left: Fine-tuning jointly with retrieval yields better performance. Centre: The choice of embeddings for retrieval does not change the performance drastically but can lead to some improvements. Right: Methods using a context that does not depend on the current query do not match the performance of methods that use a local context.
  • ...and 6 more figures
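
The fine-tuning procedure in the Figure 3b caption (pick a random training point, compute its $k$NNs, randomly split them into a context set and a query set) can be sketched as below. This is an illustrative sketch under assumptions, not the paper's code: the function name `sample_finetune_batch`, the parameter `n_query`, and the Euclidean distance are hypothetical choices, and the model's forward pass and loss are omitted.

```python
import numpy as np

def sample_finetune_batch(X_train, y_train, k, n_query, rng):
    """Build one shared-context fine-tuning batch, following Figure 3b.

    Assumptions (hypothetical): Euclidean distance, fixed query count n_query.
    """
    k = min(k, len(X_train))
    if not 0 < n_query < k:
        raise ValueError("n_query must be between 1 and k - 1")

    # 1) Select a random training point as the anchor.
    anchor = rng.integers(len(X_train))

    # 2) Compute the anchor's k nearest neighbours (the anchor is included).
    dists = np.linalg.norm(X_train - X_train[anchor], axis=1)
    neigh = np.argsort(dists)[:k]

    # 3) Randomly split the neighbourhood into a query set and a context set.
    perm = rng.permutation(neigh)
    qy_idx, ctx_idx = perm[:n_query], perm[n_query:]

    context = (X_train[ctx_idx], y_train[ctx_idx])
    queries = (X_train[qy_idx], y_train[qy_idx])
    return context, queries, (ctx_idx, qy_idx)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4))
y_train = rng.integers(0, 2, size=500)
context, queries, (ctx_idx, qy_idx) = sample_finetune_batch(
    X_train, y_train, k=64, n_query=16, rng=rng
)
```

All queries in the batch come from the same neighbourhood, so one context serves many queries in a single forward pass while remaining local. The inference procedure in Figure 3a instead computes a separate context for each query.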