GCondNet: A Novel Method for Improving Neural Networks on Small High-Dimensional Tabular Data
Andrei Margeloiu, Nikola Simidjievski, Pietro Lio, Mateja Jamnik
TL;DR
GCondNet tackles the challenge of learning from small, high-dimensional tabular data by building sample-wise multiplex graphs, one per feature, and using a Graph Neural Network (GNN) to learn their latent structure and condition the first-layer weights of a predictor network. The first-layer weights are formed as $W^{[1]}_{MLP}=\alpha W_{GNN}+(1-\alpha)W_{scratch}$, with $\alpha$ decaying from $1$ to $0$ over $n_{\alpha}$ steps, so training shifts gradually from graph-informed initialisation to autonomous learning. Graphs are constructed per feature using simple distance-based schemes (KNN with $k=5$ or Sparse Relative Distance) and are used only during training, so test-time prediction relies solely on the trained predictor. Empirically, GCondNet outperforms 14 baselines across 12 biomedical datasets, is robust to the choice of graph-construction method, and extends to other architectures such as TabTransformer, highlighting its generality and potential as a regularisation mechanism for small-sample, high-dimensional tabular tasks.
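The interpolation of the first-layer weights can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `alpha_schedule` and `mix_first_layer` are hypothetical, the matrix shapes are arbitrary, and the linear shape of the decay is an assumption, since the summary only states that $\alpha$ goes from $1$ to $0$ over $n_{\alpha}$ steps.

```python
import numpy as np

def alpha_schedule(step, n_alpha):
    """Mixing coefficient at a given training step.

    Assumption: linear decay from 1 to 0 over n_alpha steps,
    then held at 0. The summary does not specify the decay shape.
    """
    return max(0.0, 1.0 - step / n_alpha)

def mix_first_layer(w_gnn, w_scratch, alpha):
    """W_MLP^[1] = alpha * W_GNN + (1 - alpha) * W_scratch."""
    return alpha * w_gnn + (1.0 - alpha) * w_scratch

rng = np.random.default_rng(0)
w_gnn = rng.normal(size=(4, 6))      # hypothetical: 4 hidden units, 6 features
w_scratch = rng.normal(size=(4, 6))  # weights learned without graph input
n_alpha = 100

for step in (0, 50, 100, 200):
    alpha = alpha_schedule(step, n_alpha)
    w1 = mix_first_layer(w_gnn, w_scratch, alpha)
    print(step, alpha, w1.shape)
```

At step 0 the first layer equals the GNN-derived weights, and from step $n_{\alpha}$ onward it equals $W_{scratch}$ alone, which is consistent with the graphs being needed only during training.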
Abstract
Neural networks often struggle with high-dimensional but small sample-size tabular datasets. One reason is that current weight initialisation methods assume independence between weights, which can be problematic when there are insufficient samples to estimate the model's parameters accurately. In such small-data scenarios, leveraging additional structures can improve the model's performance and training stability. To address this, we propose GCondNet, a general approach to enhance neural networks by leveraging implicit structures present in tabular data. We create a graph between samples for each data dimension, and use Graph Neural Networks (GNNs) both to extract this implicit structure and to condition the parameters of the first layer of an underlying predictor network. By creating many small graphs, GCondNet exploits the data's high dimensionality and thus improves the performance of the underlying predictor network. We demonstrate GCondNet's effectiveness on 12 real-world datasets, where it outperforms 14 standard and state-of-the-art methods. The results show that GCondNet is a versatile framework for injecting graph-regularisation into various types of neural networks, including MLPs and tabular Transformers. Code is available at https://github.com/andreimargeloiu/GCondNet.
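To make "a graph between samples for each data dimension" concrete, the sketch below builds one KNN graph per feature column with NumPy. It is an illustration under stated assumptions, not the released code: the helper names `knn_edges_for_feature` and `build_feature_graphs` are hypothetical, distance is taken as the absolute difference of a single feature's values, and edges are directed from each sample to its $k$ nearest neighbours.

```python
import numpy as np

def knn_edges_for_feature(values, k=5):
    """Directed KNN edges between samples, using one feature only.

    Assumption: distance is |x_i - x_j| on that feature's values.
    Returns an integer array of shape (2, n * k): row 0 holds source
    nodes, row 1 holds their nearest neighbours.
    """
    values = np.asarray(values, dtype=float)
    n = values.shape[0]
    k = min(k, n - 1)                          # cannot have more neighbours than n - 1
    dist = np.abs(values[:, None] - values[None, :])
    np.fill_diagonal(dist, np.inf)             # exclude self-loops
    neighbours = np.argsort(dist, axis=1)[:, :k]
    sources = np.repeat(np.arange(n), k)
    return np.stack([sources, neighbours.reshape(-1)])

def build_feature_graphs(X, k=5):
    """One sample-wise graph per column of X (shape: n_samples x n_features)."""
    return [knn_edges_for_feature(X[:, j], k) for j in range(X.shape[1])]

# Hypothetical toy data: 20 samples, 3 features.
X = np.random.default_rng(0).normal(size=(20, 3))
graphs = build_feature_graphs(X, k=5)
print(len(graphs), graphs[0].shape)
```

Each graph has only as many nodes as there are samples, so a dataset with thousands of features yields thousands of small graphs rather than one large one.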
