Unsupervised Feature Selection Through Group Discovery

Shira Lifshitz, Ofir Lindenbaum, Gal Mishne, Ron Meir, Hadas Benisty

TL;DR

GroupFS tackles unsupervised feature selection when signals aggregate in latent feature groups. It jointly learns group structure and selects informative groups through an end-to-end differentiable framework that enforces Laplacian smoothness on both sample and feature graphs, and applies a group-sparsity regularizer. By leveraging a Gumbel-Softmax derived assignment and stochastic gates, GroupFS discovers latent groups without supervision and achieves competitive or superior clustering accuracy across nine benchmarks, with interpretable, domain-aligned groupings. Limitations include reliance on Euclidean distances for graph construction and a single global notion of group importance; future work includes manifold-aware distances and time- or condition-adaptive grouping.
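The two sampling ingredients named above, a Gumbel-Softmax assignment of features to groups and stochastic gates over groups, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The sizes `d` and `C`, the temperature `tau`, the gate parameters `mu` and `sigma`, and the way the two pieces are combined into `feature_mask` are all assumptions chosen for illustration. The gate follows the common clipped-Gaussian relaxation of a Bernoulli variable, which may differ from the exact form used in the paper.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Relaxed one-hot sample per row: softmax((logits + Gumbel noise) / tau)."""
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))                  # standard Gumbel(0, 1) noise
    scaled = (logits + gumbel) / tau
    scaled -= scaled.max(axis=1, keepdims=True)   # numerical stability
    expd = np.exp(scaled)
    return expd / expd.sum(axis=1, keepdims=True)

def stochastic_gates(mu, sigma, rng):
    """Clipped-Gaussian relaxation of Bernoulli gates, values in [0, 1]."""
    eps = rng.normal(0.0, sigma, size=mu.shape)
    return np.clip(mu + eps, 0.0, 1.0)

rng = np.random.default_rng(0)
d, C = 6, 3                          # hypothetical: 6 features, 3 latent groups
logits = rng.normal(size=(d, C))     # stand-in for learnable assignment logits
mu = np.array([0.9, 0.1, 0.5])       # stand-in for learnable gate means

A = gumbel_softmax(logits, tau=0.5, rng=rng)   # (d, C) soft feature-to-group assignment
z = stochastic_gates(mu, sigma=0.5, rng=rng)   # (C,) one gate per group
feature_mask = A @ z                           # (d,) weight each feature inherits from its groups
print(np.round(feature_mask, 3))
```

Lowering `tau` pushes each row of `A` toward a one-hot vector, so every feature commits to a single group. In a trainable version, `logits` and `mu` would be parameters updated by gradient descent in an automatic-differentiation framework.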

Abstract

Unsupervised feature selection (FS) is essential for high-dimensional learning tasks where labels are not available. It helps reduce noise, improve generalization, and enhance interpretability. However, most existing unsupervised FS methods evaluate features in isolation, even though informative signals often emerge from groups of related features. For example, adjacent pixels, functionally connected brain regions, or correlated financial indicators tend to act together, making independent evaluation suboptimal. Although some methods attempt to capture group structure, they typically rely on predefined partitions or label supervision, limiting their applicability. We propose GroupFS, an end-to-end, fully differentiable framework that jointly discovers latent feature groups and selects the most informative groups among them, without relying on fixed a priori groups or label supervision. GroupFS enforces Laplacian smoothness on both feature and sample graphs and applies a group-sparsity regularizer to learn a compact, structured representation. Across nine benchmarks spanning images, tabular data, and biological datasets, GroupFS consistently outperforms state-of-the-art unsupervised FS methods in clustering accuracy and selects groups of features that align with meaningful patterns.
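To make the notion of Laplacian smoothness concrete, the sketch below scores each feature $f$ by the quadratic form $f^\top L f = \tfrac{1}{2}\sum_{i,j} W_{ij}(f_i - f_j)^2$ on a sample graph, where $L = D - W$ is the unnormalized graph Laplacian. A low score means the feature varies slowly across similar samples. This is a generic illustration rather than the GroupFS objective: the toy two-cluster data, the Gaussian kernel, and the bandwidth `sigma` are assumptions, and the helper names are hypothetical.

```python
import numpy as np

def gaussian_graph(X, sigma):
    """Dense Gaussian-kernel affinity built from pairwise Euclidean distances."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)          # guard against tiny negatives
    W = np.exp(-sq_dists / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)                      # no self-loops
    return W

def laplacian_smoothness(X, W):
    """Per-feature f^T L f with the unnormalized Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return np.sum(X * (L @ X), axis=0)

rng = np.random.default_rng(0)
labels = np.repeat([0.0, 1.0], 20)                       # two toy clusters, 20 samples each
informative = 4.0 * labels + 0.1 * rng.normal(size=40)   # follows the cluster structure
nuisance = rng.normal(size=40)                           # unrelated to the clusters
X = np.column_stack([informative, nuisance])
X = (X - X.mean(axis=0)) / X.std(axis=0)                 # standardize each feature

W = gaussian_graph(X, sigma=0.5)        # sample graph (assumed bandwidth)
scores = laplacian_smoothness(X, W)     # lower = smoother on the graph
print(scores)
```

On this toy data the cluster-aligned feature receives a much lower score than the nuisance feature. The same quadratic form can be evaluated on a feature graph by building the affinity over the columns of `X` instead of its rows.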

Paper Structure

This paper contains 44 sections, 16 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Illustration: GroupFS learns feature-to-group associations, enforces smoothness on the feature graph, infers the importance of each group, and reconstructs a smoother sample-similarity graph.
  • Figure 2: Two-moons synthetic data. (A) 2D visualization of the dataset under low and high Gaussian noise levels ($\text{STD} = 0.05$ and $\text{STD} = 0.45$). (B) Feature correlation matrices ($20 \times 20$, lower triangle) with two levels of correlation strength ($\rho = 1.00$ and $\rho = 0.60$). (C) Final training loss as a function of correlation strength $\rho$, showing lower loss for stronger correlations. (D) Final training loss as a function of noise standard deviation, showing robustness to moderate sample-level noise. Results in (C,D) are averaged over 10 random seeds; error bars denote standard error.
  • Figure 3: Two-moons: Effect of feature dimension $d$ and group count $C$. Mean $RG_{\text{sim}}$, TPR, and FDR of the best-loss model over 10 random seeds. Complementary standard-deviation results are reported in the appendix (see Figure 5).
  • Figure 4: GroupFS on NMNIST (3 vs. 8). (A) Pixel groups discovered by GroupFS, colored by group ID and ranked by importance (1 = highest, 7 = lowest). (B) The top two groups align with class-relevant regions. (C) Noisy image examples of digits '8' and '3'.
  • Figure 5: Two-moons run-to-run variability: Effect of feature dimension $d$ and group count $C$. Standard deviation of $RG_{\text{sim}}$, TPR, and FDR for the best-loss model over 10 random seeds.