GANDALF: Gated Adaptive Network for Deep Automated Learning of Features
Manu Joseph, Harsh Raj
TL;DR
GANDALF targets the gap between deep learning and gradient-boosted methods on tabular data by introducing Gated Feature Learning Units (GFLUs), which combine learnable feature masks with a gating mechanism. The architecture stacks multiple GFLUs to build a hierarchical feature representation, which is then passed through a lightweight MLP for prediction, achieving strong accuracy with fewer parameters and less compute. The work provides extensive validation on public benchmarks (Tabular Benchmark and TabSurvey) and offers interpretability through aggregated feature masks and fidelity analyses with GradientSHAP and DeepLIFT. The authors also share an open-source implementation in PyTorch Tabular under the MIT License, enabling practical adoption and further research on tabular deep learning models.
Abstract
We propose a novel high-performance, interpretable, and parameter- and computationally efficient deep learning architecture for tabular data, Gated Adaptive Network for Deep Automated Learning of Features (GANDALF). GANDALF relies on a new tabular processing unit with a gating mechanism and built-in feature selection, called the Gated Feature Learning Unit (GFLU), as its feature representation learning unit. Through experiments on multiple established public benchmarks, we demonstrate that GANDALF outperforms or stays on par with SOTA approaches such as XGBoost, SAINT, and FT-Transformers. The code is available at github.com/manujosephv/pytorch_tabular under the MIT License.
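To make the idea of a gated unit with built-in feature selection concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation and does not use the PyTorch Tabular API. It assumes a softmax over learnable logits as the feature mask and a single sigmoid gate that blends the previous representation with a tanh candidate; the paper's exact formulation may differ. All names (`GatedUnitSketch`, `stack_forward`, `aggregated_mask`) are hypothetical, the weights are random rather than trained, and averaging the per-unit masks is only one assumed way to aggregate them.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def linear(weights, bias, vec):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


class GatedUnitSketch:
    """Hypothetical gated unit: soft feature mask plus one sigmoid gate."""

    def __init__(self, n_features, rng):
        n = n_features
        # Learnable in a real model; randomly initialised here.
        self.mask_logits = [rng.gauss(0.0, 1.0) for _ in range(n)]
        self.w_gate = [[rng.gauss(0.0, 0.1) for _ in range(2 * n)] for _ in range(n)]
        self.b_gate = [0.0] * n
        self.w_cand = [[rng.gauss(0.0, 0.1) for _ in range(2 * n)] for _ in range(n)]
        self.b_cand = [0.0] * n

    def feature_mask(self):
        # Assumption: softmax mask, so the weights are positive and sum to 1.
        return softmax(self.mask_logits)

    def forward(self, x, h):
        # Soft feature selection on the raw input features.
        selected = [m * xi for m, xi in zip(self.feature_mask(), x)]
        joint = selected + h  # concatenate selected features and state
        gate = [sigmoid(v) for v in linear(self.w_gate, self.b_gate, joint)]
        cand = [math.tanh(v) for v in linear(self.w_cand, self.b_cand, joint)]
        # The gate decides how much of the old representation to overwrite.
        return [(1.0 - g) * hi + g * c for g, hi, c in zip(gate, h, cand)]


def stack_forward(units, x):
    """Pass the representation through the units in order; each unit sees raw x."""
    h = list(x)
    for unit in units:
        h = unit.forward(x, h)
    return h


def aggregated_mask(units):
    """Average the per-unit masks (assumed aggregation) as a rough importance score."""
    masks = [u.feature_mask() for u in units]
    n = len(masks[0])
    return [sum(m[i] for m in masks) / len(masks) for i in range(n)]


rng = random.Random(0)
units = [GatedUnitSketch(4, rng) for _ in range(3)]
representation = stack_forward(units, [0.5, -1.0, 2.0, 0.0])
importance = aggregated_mask(units)
```

In a full model, `representation` would feed a small MLP head, and `importance` would be read as a per-feature relevance score. The sketch is only meant to show how masking and gating interact inside one unit and across a stack.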
