On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning

Anton Frederik Thielmann, Soheila Samiee

TL;DR

This paper critically examines the latest innovations in tabular DL with a dual focus on performance and computational efficiency, and proposes a new approach called mamba-tabular, which combines language model-based and reinforcement learning approaches.

Abstract

Recent advancements in tabular deep learning (DL) have led to substantial performance improvements, surpassing the capabilities of traditional models. With the adoption of techniques from natural language processing (NLP), such as language model-based approaches, DL models for tabular data have also grown in complexity and size. Although tabular datasets do not typically pose scalability issues, the escalating size of these models has raised efficiency concerns. Despite its importance, efficiency has been relatively underexplored in tabular DL research. This paper critically examines the latest innovations in tabular DL, with a dual focus on performance and computational efficiency. The source code is available at https://github.com/basf/mamba-tabular.

Paper Structure

This paper contains 10 sections, 4 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Memory consumption as a function of the number of features. (Left) GPU memory usage for datasets with fewer than 100 features/variables. MambAttention and Mambular show the highest memory usage, increasing linearly with the number of features. Mambular-Triton's GPU memory consumption is significantly lower than that of base Mambular. (Right) For a large number of features, FT-Transformer's memory usage increases quadratically, approaching that of Mambular and MambAttention at around 400 features.
  • Figure 2: Memory consumption as a function of embedding dimension for a fixed set of 12 features. The memory consumption of a pure PyTorch Mambular implementation increases significantly faster than that of the FT-Transformer.
  • Figure 3: Average rank performance vs GPU memory usage. Circle size represents the total computation time for a forward pass on a batch of 32 with 10 numerical and 10 categorical features and an embedding size of 64.
  • Figure 4: Memory consumption (left panel) and GPU time taken (right panel) during a backward pass as a function of the number of features.
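
The linear versus quadratic scaling described in the Figure 1 caption can be illustrated with a back-of-the-envelope count of activation elements. The sketch below is not taken from the paper: the function names, the batch size, head count, `d_inner`, and `d_state` values are all illustrative assumptions, and it assumes that an unfused selective scan materializes one state of size `d_inner * d_state` per feature token while self-attention stores one `n x n` score matrix per head. It only shows why a quadratic term eventually overtakes a linear term with a large constant; the actual crossover point depends on the real model configuration.

```python
# Rough activation-count model. All sizes are hypothetical assumptions,
# not measurements or settings reported in the paper.

def attention_score_elements(n_features: int, batch: int = 32, heads: int = 8) -> int:
    """Elements in the self-attention score matrices: one (n x n) matrix per head."""
    return batch * heads * n_features * n_features


def scan_state_elements(n_features: int, batch: int = 32,
                        d_inner: int = 128, d_state: int = 16) -> int:
    """Elements in a per-token state materialized by a naive (unfused) selective scan."""
    return batch * n_features * d_inner * d_state


def crossover(heads: int = 8, d_inner: int = 128, d_state: int = 16) -> int:
    """Smallest feature count at which the quadratic term exceeds the linear one."""
    n = 1
    while (attention_score_elements(n, heads=heads)
           <= scan_state_elements(n, d_inner=d_inner, d_state=d_state)):
        n += 1
    return n


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, attention_score_elements(n), scan_state_elements(n))
    print("crossover under these assumed sizes:", crossover())
```

Under this simplified model the linear term dominates for small feature counts because of its large constant factor (`d_inner * d_state`), which is consistent in spirit with the ordering the caption reports for fewer than 100 features, while the quadratic term catches up once the number of features exceeds that constant divided by the number of heads.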