GFSNetwork: Differentiable Feature Selection via Gumbel-Sigmoid Relaxation

Witold Wydmański; Marek Śmieja

TL;DR

GFSNetwork performs feature selection on high-dimensional data by learning a differentiable mask through temperature-controlled Gumbel-Sigmoid sampling. The model consists of a masking network, which produces the Gumbel-Sigmoid mask, and a task network, which learns from the masked inputs. Training minimizes $\mathcal{L}_{total} = \mathcal{L}_{task} + \lambda \mathcal{L}_{select}$ while the temperature $\tau$ is annealed to promote sparsity without sacrificing performance. The method reaches competitive accuracy with far fewer features on classification, regression, and metagenomic benchmarks, yields interpretable feature subsets (illustrated by MNIST visualizations), and adds near-constant computational overhead. Its main limitation appears on engineered second-order feature interactions, which motivates future work on capturing feature interdependencies while preserving scalability.

Abstract

Feature selection in deep learning remains a critical challenge, particularly for high-dimensional tabular data where interpretability and computational efficiency are paramount. We present GFSNetwork, a novel neural architecture that performs differentiable feature selection through temperature-controlled Gumbel-Sigmoid sampling. Unlike traditional methods, which require the user to specify the number of features in advance, GFSNetwork determines this number automatically during end-to-end training. Moreover, GFSNetwork maintains constant computational overhead regardless of the number of input features. We evaluate GFSNetwork on a series of classification and regression benchmarks, where it consistently outperforms recent methods, including DeepLasso and attention maps, as well as traditional feature selectors, while using significantly fewer features. Furthermore, we validate our approach on real-world metagenomic datasets, demonstrating its effectiveness on high-dimensional biological data. In conclusion, our method provides a scalable solution that bridges the gap between neural network flexibility and traditional feature selection interpretability. We share our Python implementation of GFSNetwork at https://github.com/wwydmanski/GFSNetwork, as well as a PyPI package (gfs_network).
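
The temperature-controlled Gumbel-Sigmoid sampling and the combined objective $\mathcal{L}_{total} = \mathcal{L}_{task} + \lambda \mathcal{L}_{select}$ described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation and does not use the gfs_network package API. All function names are hypothetical, the selection penalty is assumed to be the mean mask value, and the exponential annealing schedule and its default temperatures are assumptions, since the text above does not specify them.

```python
import math
import random


def gumbel_sigmoid(logit, tau, rng):
    """Draw one relaxed binary sample in [0, 1] from a selection logit."""
    # Two independent Gumbel(0, 1) draws; their difference is logistic noise.
    u1 = rng.uniform(1e-10, 1.0 - 1e-10)
    u2 = rng.uniform(1e-10, 1.0 - 1e-10)
    g1 = -math.log(-math.log(u1))
    g2 = -math.log(-math.log(u2))
    z = (logit + g1 - g2) / tau
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sample_mask(logits, tau, rng):
    """Sample one soft mask entry per input feature."""
    return [gumbel_sigmoid(logit, tau, rng) for logit in logits]


def apply_mask(features, mask):
    """Element-wise masking of one input row before it reaches the task network."""
    return [x * m for x, m in zip(features, mask)]


def total_loss(task_loss, mask, lam):
    """L_total = L_task + lam * L_select.

    ASSUMPTION: L_select is the mean mask value, so keeping fewer
    features lowers the penalty. The source does not define L_select.
    """
    select_loss = sum(mask) / len(mask)
    return task_loss + lam * select_loss


def anneal_temperature(step, total_steps, tau_start=2.0, tau_end=0.1):
    """ASSUMPTION: exponential decay from tau_start to tau_end."""
    fraction = step / total_steps
    return tau_start * (tau_end / tau_start) ** fraction


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [3.0, -3.0, 0.0]  # hypothetical learned selection logits
    for step in (0, 50, 100):
        tau = anneal_temperature(step, 100)
        mask = sample_mask(logits, tau, rng)
        print(step, round(tau, 3), [round(m, 3) for m in mask])
```

At high temperature the sampled mask values are spread across the unit interval, which keeps gradients informative early in training. As $\tau$ decreases, samples concentrate near 0 or 1, so the soft mask increasingly behaves like a hard feature subset.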
