Real-Time Explanations for Tabular Foundation Models

Luan Borges Teodoro Reis Sena, Francisco Galuppo Azevedo

Abstract

Interpretability is central to scientific machine learning, as understanding why models make predictions enables hypothesis generation and validation. While tabular foundation models show strong performance, existing explanation methods such as SHAP are computationally expensive, limiting interactive exploration. We introduce ShapPFN, a foundation model that integrates Shapley value regression directly into its architecture, producing both predictions and explanations in a single forward pass. On standard benchmarks, ShapPFN achieves competitive performance while producing high-fidelity explanations (R² = 0.96, cosine similarity = 0.99) over 1000× faster than KernelSHAP (0.06 s vs. 610 s). Our code is available at https://github.com/kunumi/ShapPFN.

Figures (1)

  • Figure 1: ShapPFN architecture. The model encodes features and targets into embeddings, processes them through transformer layers with alternating attention over features and datapoints, then decodes into a global baseline and per-feature contributions. The final prediction is explicitly additive over features, enabling SHAP-like decomposition directly from the architecture.
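To make the additive decomposition concrete, below is a minimal PyTorch sketch of what such a decoding head could look like. All names, shapes, and layer choices here (AdditiveShapDecoder, contrib_head, baseline_head, d_model) are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn

class AdditiveShapDecoder(nn.Module):
    """Illustrative sketch of the additive decoding step in Figure 1.

    Hypothetical module: given per-feature embeddings from the
    transformer backbone, emit a global baseline plus one scalar
    contribution per feature. The prediction is their sum, so each
    contribution can be read off directly as a SHAP-like attribution.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.contrib_head = nn.Linear(d_model, 1)   # per-feature contribution
        self.baseline_head = nn.Linear(d_model, 1)  # global baseline

    def forward(self, feat_emb: torch.Tensor, global_emb: torch.Tensor):
        # feat_emb: (batch, n_features, d_model), one embedding per feature
        # after the alternating feature/datapoint attention layers.
        # global_emb: (batch, d_model), a pooled dataset-level embedding.
        contribs = self.contrib_head(feat_emb).squeeze(-1)     # (batch, n_features)
        baseline = self.baseline_head(global_emb).squeeze(-1)  # (batch,)
        # Additive by construction: prediction = baseline + sum of
        # contributions, so explanations come with the forward pass.
        prediction = baseline + contribs.sum(dim=-1)
        return prediction, baseline, contribs

decoder = AdditiveShapDecoder(d_model=64)
feat_emb = torch.randn(8, 5, 64)   # 8 datapoints, 5 features
global_emb = torch.randn(8, 64)
pred, base, phi = decoder(feat_emb, global_emb)
assert torch.allclose(pred, base + phi.sum(dim=-1))  # additivity holds exactly
```

Because additivity holds by construction, the per-feature contributions are produced in the same forward pass as the prediction, with no post-hoc sampling over feature coalitions; this is what removes the cost that makes KernelSHAP slow.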