From Tables to Signals: Revealing Spectral Adaptivity in TabPFN

Jianqiao Zheng, Cameron Gordon, Yiping Ji, Hemanth Saratchandran, Simon Lucey

TL;DR

This work reframes TabPFN as a model whose inductive biases are data-conditioned rather than fixed by architecture alone, by analyzing it through a signal-reconstruction lens. The authors introduce a context kernel and demonstrate spectral adaptivity: TabPFN’s effective bandwidth expands as the number of in-context samples increases, and positional encodings can steer frequency response to reveal higher-frequency details. They show training-free image denoising and provide a pathway linking tabular foundation models with implicit neural representations. The results suggest a flexible, data-driven inductive bias that can be tuned via input encoding and context size for fast signal reconstruction tasks without gradient-based optimization.

Abstract

Task-agnostic tabular foundation models such as TabPFN have achieved impressive performance on tabular learning tasks, yet the origins of their inductive biases remain poorly understood. In this work, we study TabPFN through the lens of signal reconstruction and provide the first frequency-based analysis of its in-context learning behavior. We show that TabPFN possesses a broader effective frequency capacity than standard ReLU-MLPs, even without hyperparameter tuning. Moreover, unlike MLPs whose spectra evolve primarily over training epochs, we find that TabPFN's spectral capacity adapts directly to the number of samples provided in-context, a phenomenon we term Spectral Adaptivity. We further demonstrate that positional encoding modulates TabPFN's frequency response, mirroring classical results in implicit neural representations. Finally, we show that these properties enable TabPFN to perform training-free and hyperparameter-free image denoising, illustrating its potential as a task-agnostic implicit model. Our analysis provides new insight into the structure and inductive biases of tabular foundation models and highlights their promise for broader signal reconstruction tasks.

Paper Structure

This paper contains 21 sections, 2 theorems, 9 equations, 14 figures, 1 algorithm.

Key Result

Proposition 1

For a ReLU‑MLP with fixed architecture and initialization, the spectral distribution of its NTK, and consequently its spectral bias, remains invariant with respect to the number of training samples $N$.
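One way to read Proposition 1 is sketched below using the standard NTK definitions. The notation ($f_\theta$, $\Theta$, $p$, $K_N$, $\lambda_k$, $\phi_k$) is assumed here for illustration and may differ from the paper's own statement and proof.

```latex
% NTK of a network f_\theta at its initialization \theta_0 (standard definition)
\Theta(x, x') = \left\langle \nabla_\theta f_{\theta_0}(x),\; \nabla_\theta f_{\theta_0}(x') \right\rangle

% Empirical Gram matrix on N training inputs x_1, \dots, x_N drawn from a density p
(K_N)_{ij} = \Theta(x_i, x_j), \qquad i, j = 1, \dots, N

% Eigenpairs of the kernel integral operator; N does not appear anywhere
\int \Theta(x, x')\, \phi_k(x')\, p(x')\, dx' = \lambda_k\, \phi_k(x)

% The spectrum of K_N / N approximates the leading operator eigenvalues as N grows
\lambda_k\!\left(\tfrac{1}{N} K_N\right) \approx \lambda_k
```

Under these assumptions, $\Theta$ depends only on the architecture and initialization, so adding training samples refines the discretization of the same operator rather than changing its eigenvalue decay. This is consistent with the unchanged normalized spectra reported for 64, 512, and 4096 samples in the Figure 3 caption.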

Figures (14)

  • Figure 1: Signal reconstruction results at various sampling rates (columns) using TabPFN, ReLU-MLPs, and ReLU-MLPs with positional encoding (PE). ReLU-MLPs fail to reconstruct high-frequency components regardless of the amount of data provided, indicating that their spectrum is essentially insensitive to the number of samples. Positional encoding methods can extend their high-frequency capacity; however, hyperparameters tuned for one sampling rate (rightmost column) may cause unstable and noisy reconstructions at other sampling rates (leftmost column), requiring careful task-specific selection. In contrast, we find that the spectral capacity of TabPFN naturally adapts to the number of context samples—a phenomenon we refer to as Spectral Adaptivity.
  • Figure 1: Singular value decomposition of Layer 8 self-attention maps, each subsampled from its original context size to a common $64\times64$, computed over different numbers of context samples ($64$–$4096$). As shown in \ref{fig:tab_attn_map}, the attention maps become increasingly sharp and structured with more samples. The SVD results provide a quantitative view of this trend: attention maps obtain progressively richer spectra as the number of context samples increases, indicating higher-rank and more expressive attention patterns.
  • Figure 2: Recasting 1-D signal as tabular dataset. By representing sampled points $(x_i, y_i)$ as rows of a table, TabPFN can be applied to signal reconstruction: the model predicts $\hat{y}$ for query inputs $(\hat{x})$ in a single forward pass. Framing signals as tabular data enables analysis of TabPFN from a signal reconstruction perspective.
  • Figure 2: Self-attention maps of TabPFN across transformer layers and different numbers of context samples. Columns correspond to context sample sizes ($64$, $256$, $512$, $1024$, $4096$), and rows correspond to Layers 1, 4, 8, and 12. All attention maps are averaged over 6 heads and subsampled to $64\times64$ for consistent visual comparison. As sampling increases, the attention maps become sharper, more structured, and more concentrated around local neighborhoods. These qualitative changes reflect the model’s increasingly expressive internal representations and complement the quantitative SVD analysis in \ref{fig:attn_map_svd}, evidencing TabPFN’s sample-dependent attention behavior.
  • Figure 3: Normalized eigenvalue spectra of the empirical NTKs for ReLU-MLPs of varying depths, evaluated with 64, 512, and 4096 training samples. Across different sampling densities, the eigenvalue decay profiles remain unchanged, whereas network depth induces clear separations between curves. This demonstrates that NTK spectral properties are determined by the model architecture and are agnostic to the number of training samples.
  • ...and 9 more figures
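
The recasting described in the "Recasting 1-D signal as tabular dataset" caption above can be sketched as follows. This is a minimal illustration and not the authors' code. The helper names (`signal_to_table`, `fourier_features`, `reconstruct`), the sampling grid, and the particular sin/cos encoding are hypothetical. `NearestRowRegressor` is only a stand-in that shows the fit/predict flow without downloading model weights. It is not TabPFN and has none of its spectral behavior. A scikit-learn-style regressor such as `TabPFNRegressor` from the `tabpfn` package could presumably be passed in its place, though that is untested here.

```python
import math
from typing import Callable, List, Tuple

Row = List[float]


def fourier_features(x: float, num_bands: int) -> Row:
    """Hypothetical positional encoding: x followed by sin/cos at octave frequencies.

    With num_bands == 0 the row is just [x], i.e. the raw coordinate.
    """
    feats = [x]
    for k in range(num_bands):
        feats.append(math.sin((2 ** k) * math.pi * x))
        feats.append(math.cos((2 ** k) * math.pi * x))
    return feats


def signal_to_table(
    signal: Callable[[float], float],
    num_context: int,
    num_query: int,
    num_bands: int = 0,
) -> Tuple[List[Row], List[float], List[Row]]:
    """Sample `signal` on [0, 1) and return (X_context, y_context, X_query).

    Each context row is one sampled point (x_i, y_i); query rows carry only inputs.
    """
    xs_context = [i / num_context for i in range(num_context)]
    xs_query = [(j + 0.5) / num_query for j in range(num_query)]
    X_context = [fourier_features(x, num_bands) for x in xs_context]
    y_context = [signal(x) for x in xs_context]
    X_query = [fourier_features(x, num_bands) for x in xs_query]
    return X_context, y_context, X_query


class NearestRowRegressor:
    """Stand-in model with a fit/predict interface (NOT TabPFN)."""

    def fit(self, X: List[Row], y: List[float]) -> "NearestRowRegressor":
        self._X = [list(row) for row in X]
        self._y = list(y)
        return self

    def predict(self, X: List[Row]) -> List[float]:
        preds = []
        for query in X:
            # Index of the context row closest to the query in squared distance.
            best = min(
                range(len(self._X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(self._X[i], query)),
            )
            preds.append(self._y[best])
        return preds


def reconstruct(model, signal, num_context, num_query, num_bands=0) -> List[float]:
    """Fit on the context table, then predict all query rows in one call."""
    X_context, y_context, X_query = signal_to_table(
        signal, num_context, num_query, num_bands
    )
    model.fit(X_context, y_context)
    return list(model.predict(X_query))


def target(x: float) -> float:
    """Example 3-cycle sinusoid on [0, 1)."""
    return math.sin(2 * math.pi * 3 * x)


def mean_abs_error(num_context: int, num_query: int = 200) -> float:
    """Mean absolute reconstruction error of the stand-in at a given context size."""
    preds = reconstruct(NearestRowRegressor(), target, num_context, num_query)
    xs_query = [(j + 0.5) / num_query for j in range(num_query)]
    return sum(abs(p - target(x)) for p, x in zip(preds, xs_query)) / num_query
```

Varying `num_context` changes only the number of table rows, and `num_bands` changes only the input columns. Those are the two knobs the summary associates with context size and positional encoding, so neither requires retraining in this framing.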

Theorems & Definitions (5)

  • Remark 1: Spectral Bias (Rahaman et al., 2019)
  • Proposition 1: Data‑Agnostic Spectral Bias in ReLU‑MLPs
  • Definition 1: Context Kernel
  • Proposition 2
  • Remark 2: Spectral Adaptivity