Hybrid Autoencoders for Tabular Data: Leveraging Model-Based Augmentation in Low-Label Settings

Erel Naor, Ofir Lindenbaum

TL;DR

TANDEM addresses learning from limited labeled tabular data with a hybrid self-supervised autoencoder that couples a neural encoder with an oblivious soft decision tree (OSDT) encoder. Each encoder receives a sample-specific masked view produced by its own gating network, and both decode through a shared decoder; cross-view reconstruction and latent-space alignment losses guide joint training. Inference uses only the neural encoder, preserving SSL compatibility. Spectral analysis reveals complementary inductive biases: the neural path favors smooth, low-frequency functions, which are less suited to tabular structure, whereas the OSDT path captures sharp, localized, high-frequency patterns. Empirically, TANDEM reports state-of-the-art performance on low-label classification and regression across diverse tabular datasets. Ablations confirm that both encoders and the gating networks are necessary, and frequency-decomposition analyses illustrate the advantage of model-based augmentation.

Abstract

Deep neural networks often under-perform on tabular data due to their sensitivity to irrelevant features and a spectral bias toward smooth, low-frequency functions. These limitations hinder their ability to capture the sharp, high-frequency signals that often define tabular structure, especially under limited labeled samples. While self-supervised learning (SSL) offers promise in such settings, it remains challenging in tabular domains due to the lack of effective data augmentations. We propose a hybrid autoencoder that combines a neural encoder with an oblivious soft decision tree (OSDT) encoder, each guided by its own stochastic gating network that performs sample-specific feature selection. Together, these structurally different encoders and model-specific gating networks implement model-based augmentation, producing complementary input views tailored to each architecture. The two encoders, trained with a shared decoder and cross-reconstruction loss, learn distinct yet aligned representations that reflect their respective inductive biases. During training, the OSDT encoder (robust to noise and effective at modeling localized, high-frequency structure) guides the neural encoder toward representations more aligned with tabular data. At inference, only the neural encoder is used, preserving flexibility and SSL compatibility. Spectral analysis highlights the distinct inductive biases of each encoder. Our method achieves consistent gains in low-label classification and regression across diverse tabular datasets, outperforming deep and tree-based supervised baselines.
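As a rough illustration of how two gating networks can turn one sample into two masked views, the sketch below uses a clipped-Gaussian gate, a common relaxation for stochastic feature selection. The gate form, the noise scale, and the names `stochastic_gate`, `masked_view`, `mu_neural`, and `mu_osdt` are assumptions made for this example and are not taken from the text above; in particular, the fixed gate means stand in for the per-sample output of a learned gating network.

```python
import random


def stochastic_gate(mu, sigma, rng):
    """Sample one gate per feature: z_j = clip(mu_j + sigma * eps_j, 0, 1)."""
    return [min(1.0, max(0.0, m + sigma * rng.gauss(0.0, 1.0))) for m in mu]


def masked_view(x, gate):
    """Element-wise product of a sample with its gate vector."""
    return [xi * gi for xi, gi in zip(x, gate)]


rng = random.Random(0)
x = [0.8, -1.2, 3.0, 0.1]

# Hypothetical gate means. In a trained model, each encoder's gating network
# would produce these values separately for every input sample.
mu_neural = [0.9, 0.1, 0.7, 0.5]
mu_osdt = [0.2, 0.95, 0.6, 0.0]

# Two different masked views of the same sample, one per encoder.
view_neural = masked_view(x, stochastic_gate(mu_neural, 0.5, rng))
view_osdt = masked_view(x, stochastic_gate(mu_osdt, 0.5, rng))

# With sigma = 0 the gate is deterministic and equals the clipped mean.
deterministic_gate = stochastic_gate(mu_neural, 0.0, rng)
```

Because every gate value lies in [0, 1], each view keeps, shrinks, or zeroes individual features, so the two encoders see different subsets of the same row rather than synthetically perturbed data.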

Paper Structure

This paper contains 36 sections, 9 equations, 10 figures, 19 tables.

Figures (10)

  • Figure 1: Overview of the TANDEM architecture. Input $X$ is augmented via two distinct stochastic gating networks, producing separate masked views for a neural encoder and an OSDT encoder. Each encoder is illustrated on the right to reflect its inductive bias. TANDEM is trained using a reconstruction loss, an alignment loss, and a latent representation similarity (LRS) loss. During inference, only the neural encoder and the MLP classifier are used to predict the output label $\hat{y}$, based on the view gated by the neural encoder's gating net.
  • Figure 2: Architecture of the OSDT encoder in TANDEM. At each depth $\ell \in \{1, \dots, L\}$, the input $x$ is gated by a dedicated gating network $g_\ell$, producing a masked vector $\tilde{x}_\ell = x \odot g_\ell$. This masked input is projected by a learnable vector $w_\ell$ and compared against a threshold $\tau_\ell$, producing a soft binary decision. Probabilities propagate through the tree to define $p_{\mathbf{b}}$, the soft path probability to leaf $R_{\mathbf{b}}$. The final output of the encoder is the concatenation of all leaf probabilities, $Z^{\text{OSDT}} \in \mathbb{R}^{2^L}$, serving as a disentangled latent representation.
  • Figure 3: Classification accuracy distribution across models. Boxplot across baseline models and TANDEM; red lines denote the mean and black lines denote the median.
  • Figure 4: Classification Dolan–Moré profiles. Model accuracy relative to the per-dataset best; higher curves indicate stronger performance across datasets.
  • Figure 5: Regression MSE distribution across models. Boxplot across baseline models and TANDEM; red lines denote the mean and black lines denote the median.
  • ...and 5 more figures
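
The forward pass described in the Figure 2 caption can be sketched for a single sample as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the soft decision is assumed to be a sigmoid of $(w_\ell^\top \tilde{x}_\ell - \tau_\ell)$, the `temperature` parameter is hypothetical, and the gates $g_\ell$ are passed in as fixed vectors rather than produced by gating networks.

```python
import math
from itertools import product


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def osdt_encode(x, gates, weights, thresholds, temperature=1.0):
    """Return the 2**L soft leaf probabilities for one sample x.

    gates, weights: one length-d vector per depth; thresholds: one scalar
    per depth. All nodes at the same depth share the same split (oblivious).
    """
    decisions = []
    for g, w, tau in zip(gates, weights, thresholds):
        x_masked = [xi * gi for xi, gi in zip(x, g)]         # x ⊙ g_l
        score = sum(wi * xi for wi, xi in zip(w, x_masked))  # w_l · x̃_l
        decisions.append(sigmoid((score - tau) / temperature))

    # Each leaf is indexed by a bit string b; its path probability is the
    # product of d_l (bit 1) or 1 - d_l (bit 0) over all depths.
    leaves = []
    for bits in product((0, 1), repeat=len(decisions)):
        p = 1.0
        for b, d in zip(bits, decisions):
            p *= d if b == 1 else 1.0 - d
        leaves.append(p)
    return leaves


# Toy example with d = 3 features and depth L = 2 (values are arbitrary).
x = [0.5, -1.0, 2.0]
gates = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
weights = [[1.0, 1.0, 0.5], [0.3, -1.0, 0.2]]
thresholds = [0.0, 0.5]

leaf_probs = osdt_encode(x, gates, weights, thresholds)
```

Under these assumptions the output has $2^L$ entries that are non-negative and sum to one, so the latent vector can be read as a soft assignment of the sample to the leaves.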