Tabular Data: Is Deep Learning all you need?

Guri Zabërgja, Arlind Kadra, Christian M. M. Frey, Josif Grabocka

TL;DR

The paper conducts a large-scale, fair comparison of 17 tabular-classification methods across 68 OpenML datasets, using nested cross-validation and thorough hyperparameter optimization (HPO). The results indicate a paradigm shift in which Deep Learning methods prevail over traditional gradient-boosted trees. Meta-learned foundation models (e.g., TabICL, TabPFNv2) perform best in most data regimes, and refitting on the combined train+validation data after HPO can further improve predictive quality and alter model rankings. The authors also assess transfer-learning paradigms, showing that in-context learning often outperforms fine-tuning, and provide an in-depth analysis of HPO effects and hyperparameter importance. An open-source benchmark with extensive results and a transparent experimental protocol aims to standardize future research and accelerate progress in tabular-data deep learning and AutoML.
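
To make the evaluation protocol concrete, here is a minimal sketch of nested cross-validation with an optional refit on train+validation data. It is not the authors' implementation: the toy 1-D nearest-neighbour model, the grid search standing in for HPO, the fold counts, and all function names (`k_folds`, `knn_predict`, `accuracy`, `nested_cv`) are assumptions chosen only to keep the example self-contained.

```python
from statistics import mean


def k_folds(indices, k):
    """Split a list of indices into k interleaved folds; return (train, held_out) pairs."""
    folds = [indices[i::k] for i in range(k)]
    return [
        ([j for m, fold in enumerate(folds) if m != i for j in fold], folds[i])
        for i in range(k)
    ]


def knn_predict(train_x, train_y, x, k):
    """Toy binary k-nearest-neighbour classifier on 1-D features (stand-in model)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    return int(2 * sum(votes) > len(votes))  # majority vote over labels in {0, 1}


def accuracy(fit_idx, eval_idx, X, y, k):
    """Fit on fit_idx (k-NN just memorises the points) and score on eval_idx."""
    fit_x = [X[i] for i in fit_idx]
    fit_y = [y[i] for i in fit_idx]
    hits = sum(knn_predict(fit_x, fit_y, X[i], k) == y[i] for i in eval_idx)
    return hits / len(eval_idx)


def nested_cv(X, y, grid, outer_k=3, inner_k=2, refit=True):
    """Outer loop estimates test accuracy; inner loop selects the hyperparameter."""
    outer_scores = []
    for dev_idx, test_idx in k_folds(list(range(len(X))), outer_k):
        inner_splits = k_folds(dev_idx, inner_k)

        def inner_score(k):
            # Validation accuracy averaged over inner folds; test_idx is never touched.
            return mean(accuracy(tr, va, X, y, k) for tr, va in inner_splits)

        best_k = max(grid, key=inner_score)  # grid search as a stand-in for HPO
        # Refit: train the selected configuration on train+validation (all of dev_idx).
        # Without refit, only the training part of the first inner split is used.
        fit_idx = dev_idx if refit else inner_splits[0][0]
        outer_scores.append(accuracy(fit_idx, test_idx, X, y, best_k))
    return mean(outer_scores)


# Synthetic, illustrative data: one feature, label flips at x = 1.5.
X = [i / 10 for i in range(30)]
y = [int(v >= 1.5) for v in X]
score_refit = nested_cv(X, y, grid=[1, 3, 5], refit=True)
score_no_refit = nested_cv(X, y, grid=[1, 3, 5], refit=False)
```

Calling `nested_cv` with `refit=True` and `refit=False` on the same data gives two estimates that can be compared directly, which is the kind of comparison the refitting claim above refers to. The synthetic data here says nothing about the paper's actual results.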

Abstract

Tabular data represent one of the most prevalent data formats in applied machine learning, largely because they accommodate a broad spectrum of real-world problems. Existing literature has studied many of the shortcomings of neural architectures on tabular data and has repeatedly confirmed the scalability and robustness of gradient-boosted decision trees across varied datasets. However, recent deep learning models have not been subjected to a comprehensive evaluation under conditions that allow for a fair comparison with existing classical approaches. This situation motivates an investigation into whether recent deep-learning paradigms outperform classical ML methods on tabular data. Our survey fills this gap by benchmarking seventeen state-of-the-art methods, spanning neural networks, classical ML and AutoML techniques. Our empirical results over 68 diverse datasets from a well-established benchmark indicate a paradigm shift, where Deep Learning methods outperform classical approaches.

Paper Structure (45 sections, 1 equation, 25 figures, 35 tables, 1 algorithm)

Figures (25)

  • Figure 1: Taxonomy tree of algorithms applied to tabular classification (TC) models
  • Figure 2: Left: Distribution of ranks for the Deep Learning ($12$ methods), Classical ML ($3$ methods) and AutoML ($1$ method) classifier families. Right: Distribution of ranks for the Foundation Models ($5$ methods), Dataset-Specific ($7$ methods) and AutoML ($1$ method) classifier families. The boxplots illustrate the rank spread, with medians represented by black lines, diamonds representing the means, and whiskers showing the range.
  • Figure 3: Win-rate dueling matrix comparing learning methods across shared datasets. Each cell (row $i$, column $j$) shows the fraction of common datasets on which method $i$ outperforms method $j$.
  • Figure 4: Dataset landscape showing winning method families across different dataset sizes. Each point represents a dataset from the OpenML-CC18 benchmark, positioned by number of rows (x-axis) and features (y-axis) on log scales. Colors indicate which method family achieved the highest accuracy: Deep Learning methods (orange), Classical ML tree-based models (green), and ties (gray).
  • Figure 5: Critical difference (CD) diagram of the methods, where a horizontal bar indicates the absence of statistical significance. Left: CD diagram of Deep Learning vs. GBDTs, Right: CD diagram of dataset-specific vs. foundation models.
  • ...and 20 more figures
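
The win-rate dueling matrix described in the Figure 3 caption can be sketched as follows. This is an illustrative reading of the caption, not the paper's code: the function name `win_rate_matrix`, the placeholder method and dataset names, and the accuracy values are all made up, and counting ties as non-wins is an assumption.

```python
def win_rate_matrix(scores):
    """scores maps method -> {dataset: accuracy}.

    Returns {(i, j): fraction of datasets shared by i and j on which i
    strictly outperforms j}. Pairs with no shared dataset map to None.
    """
    methods = sorted(scores)
    matrix = {}
    for i in methods:
        for j in methods:
            if i == j:
                continue
            shared = scores[i].keys() & scores[j].keys()
            if not shared:
                matrix[(i, j)] = None
                continue
            # Assumption: a tie counts as a non-win for both methods.
            wins = sum(scores[i][d] > scores[j][d] for d in shared)
            matrix[(i, j)] = wins / len(shared)
    return matrix


# Hypothetical accuracies for three placeholder methods on three datasets.
# method_c has no result on d3, so its duels use only d1 and d2.
example_scores = {
    "method_a": {"d1": 0.90, "d2": 0.80, "d3": 0.70},
    "method_b": {"d1": 0.85, "d2": 0.82, "d3": 0.65},
    "method_c": {"d1": 0.88, "d2": 0.79},
}
matrix = win_rate_matrix(example_scores)
```

Restricting each duel to shared datasets matches the caption's "fraction of common datasets" and lets methods with missing results still be compared pairwise.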